- Case Study -
Scrape custodian emails for up-to-date holdings
Large Chicago asset manager.
The client had 7 custodians who delivered holdings data by e-mail only: no API, no FTP, no database.
We polled for new Office 365 e-mails using Apache Airflow and ingested live and historical data with 100% accuracy.
We provide a wide array of cutting-edge services needed to build a fast data pipeline end to end.
We take a "simple" approach to designing so-called complex systems, with an emphasis on Design and DevOps.
From brittle to resilient
Scraping e-mails isn't a "bad" thing if that's the only way forward, and in this case it was. Advanti took a process that "broke" 1-2x a week and brought it to 99.999% uptime. That saved the customer 780 quality hours a year of hands-on-keyboard break/fix/test/deploy work. From the very start, we designed test cases for each scenario and automated the CI/CD process using Git. Investing time in DevOps upfront saves in the long run.
Automated and completely serverless
We used a serverless architecture so the customer pays only for what is used, managed Apache Airflow for orchestration, and YAML-based controls for data stewards. We also future-proofed the scraping process so that the data structure can change without breaking the pipeline, and new items are automatically acquired and published into the customer's data lake.
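To make the idea of steward-controlled, change-tolerant scraping concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the pipeline we deployed: the FIELD_ALIASES mapping, the normalize_rows function, and the sample attachment are all hypothetical, and the mapping is written as a plain dict where a real setup might load it from a YAML file maintained by data stewards.

```python
import csv
import io

# Hypothetical steward-maintained mapping (an assumption for illustration).
# In practice this could live in a YAML file; a dict keeps the sketch stdlib-only.
FIELD_ALIASES = {
    "ticker": ["ticker", "symbol"],
    "quantity": ["quantity", "qty", "shares"],
    "market_value": ["market_value", "market value", "mv"],
}


def normalize_rows(csv_text, aliases=FIELD_ALIASES):
    """Map custodian-specific headers to canonical names.

    Unknown columns are passed through (lower-cased) instead of raising,
    so a custodian adding a column does not break ingestion.
    """
    lookup = {
        alias.lower(): canonical
        for canonical, names in aliases.items()
        for alias in names
    }
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {}
        for header, value in raw.items():
            if header is None:  # surplus cells with no header: skip them
                continue
            key = header.strip().lower()
            row[lookup.get(key, key)] = (value or "").strip()
        rows.append(row)
    return rows


# Hypothetical attachment body with one known-alias set and one new column.
sample = "Symbol,Qty,Market Value,Currency\nAAPL,100,18950.00,USD\n"
rows = normalize_rows(sample)
print(rows)
```

The "Currency" column is not in the mapping, yet it flows through under its own name, which is one simple way to let new items be acquired without a code change.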
Accurate and precise holdings data
Advanti captures every single e-mail and every key step of scraping the data, ensuring that there is no loss of data fidelity along the way. It's absolutely critical that the data remains intact, yet in a very usable form for consumption in the client's portfolio management dashboard, as well as by the trading and research teams.
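One way to picture capturing every e-mail and every key step is a simple audit trail that fingerprints the payload at each stage. The Python sketch below is an assumption-laden illustration, not a description of our production system: the record_step helper, the step names, the message ID, and the sample payload are all hypothetical.

```python
import hashlib


def fingerprint(payload: bytes) -> str:
    """Return a SHA-256 hex digest used to detect any change to the payload."""
    return hashlib.sha256(payload).hexdigest()


def record_step(audit_log, message_id, step, payload, row_count=None):
    """Append one audit entry for a processing step (hypothetical helper)."""
    entry = {
        "message_id": message_id,
        "step": step,
        "sha256": fingerprint(payload),
        "row_count": row_count,
    }
    audit_log.append(entry)
    return entry


audit_log = []

# Hypothetical raw attachment bytes from a custodian e-mail.
raw = b"Symbol,Qty\nAAPL,100\nMSFT,50\n"
record_step(audit_log, "msg-001", "received", raw)

# Drop the header line and record how many holdings rows survived parsing.
parsed_rows = raw.decode().strip().splitlines()[1:]
record_step(
    audit_log,
    "msg-001",
    "parsed",
    "\n".join(parsed_rows).encode(),
    row_count=len(parsed_rows),
)
```

Comparing row counts and digests between steps gives a cheap check that nothing was dropped or altered on the way to the dashboard.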
Interested in Email Scraping?
No problem. Reach out to us; we're happy to talk about email scraping in depth and how we've helped world-leading companies develop solutions they love to use!
Talk to us