Real Time Finance Data Feeds

Site-specific crawl and extraction

Client

Financial research company signalling investment options

Domain

Finance, Stock Market

Solution

Large-scale data crawling, processed with Artificial Intelligence scoring

Huge Number of Source Sites, in Unknown Formats…

The client wanted to blend market and social data in order to build a financial recommendation engine that could deliver proprietary signals for systematic equity investing.

It had a subscriber base to which it sent these trade signals daily.

To get this solution running, the company wanted all available data about listed companies that were trending in news, blogs, articles and social media. This data would feed the intelligence ranking system that the client was building internally.

Scale was the primary concern because the number of sources involved was huge: in the range of 30,000+ websites.
Availability, velocity and coverage of the data were important too, given how quickly financial markets move.

Artificial Intelligence Scoring

Scrapy Ninja’s platform, with its large-scale, low-latency crawling and scraping, was used to address this problem.

The system was tuned to operate at this scale and to crawl sources adaptively, based on how often each one publishes (very active sources vs. nearly dormant ones).
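As an illustration of frequency-based adaptive crawling, here is a minimal Python sketch. It is not Scrapy Ninja's actual scheduler: the `Source` class, the `next_interval` function, the halve/double rule and the interval bounds are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Assumed bounds, not taken from the case study.
MIN_INTERVAL_S = 5 * 60        # never crawl more often than every 5 minutes
MAX_INTERVAL_S = 24 * 60 * 60  # never wait longer than one day


@dataclass
class Source:
    url: str
    interval_s: int = 60 * 60  # assumed starting interval: 1 hour


def next_interval(source: Source, new_items: int) -> int:
    """Halve the interval when a crawl finds new items, double it otherwise."""
    if new_items > 0:
        candidate = source.interval_s // 2
    else:
        candidate = source.interval_s * 2
    # Clamp the result to the allowed range.
    return max(MIN_INTERVAL_S, min(MAX_INTERVAL_S, candidate))


# Hypothetical sources: one keeps publishing, the other stays silent.
active = Source("https://example.com/news")
dormant = Source("https://example.com/old-blog")
for _ in range(3):
    active.interval_s = next_interval(active, new_items=4)
    dormant.interval_s = next_interval(dormant, new_items=0)

print(active.interval_s, dormant.interval_s)
```

With a rule like this, active sources drift toward the minimum interval while dormant ones are visited less and less often, which frees crawl capacity for the sources that matter.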

Alerts notified the team about dead sources, so crawl results quickly became accurate and the whole system ran more efficiently than the client required. To meet the low-latency requirement of 5-10 minutes, a few components were added that could supply the necessary computing power.
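A dead-source alert can be sketched in Python as a counter of consecutive failed crawls. The `DeadSourceMonitor` class and the threshold of three failures are hypothetical, used only to illustrate the idea. The case study does not describe how the alerts were actually implemented.

```python
from collections import defaultdict

DEAD_AFTER_FAILURES = 3  # assumed threshold, not from the case study


class DeadSourceMonitor:
    """Tracks consecutive crawl failures per source (illustrative only)."""

    def __init__(self, threshold: int = DEAD_AFTER_FAILURES):
        self.threshold = threshold
        self.failures = defaultdict(int)

    def record(self, url: str, ok: bool) -> bool:
        """Record a crawl outcome; return True when an alert should fire."""
        if ok:
            # Any successful crawl resets the failure streak.
            self.failures[url] = 0
            return False
        self.failures[url] += 1
        # Fire exactly once, on the crawl that reaches the threshold.
        return self.failures[url] == self.threshold


monitor = DeadSourceMonitor()
outcomes = [False, False, False, False]  # four failed crawls in a row
alerts = [monitor.record("https://example.com/feed", ok) for ok in outcomes]
print(alerts)
```

Firing only when the streak reaches the threshold avoids repeating the same alert on every later failed crawl of a source that is already known to be dead.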

The crawled data was indexed using a hosted indexing component and fed into an Artificial Intelligence API, which scored the crawled information and classified it as positive or negative every few minutes.

Final results were delivered to the client's information system in JSON format.
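To give a sense of what one scored result in such a JSON feed might look like, here is a small Python sketch. The field names (`url`, `company`, `score`, `sentiment`, `crawled_at`) and the `build_record` helper are assumptions made for illustration. The actual schema delivered to the client is not described here.

```python
import json
from datetime import datetime, timezone


def build_record(url: str, company: str, score: float,
                 sentiment: str, crawled_at: datetime) -> dict:
    """Assemble one scored item; every field name here is hypothetical."""
    if sentiment not in ("positive", "negative"):
        raise ValueError("sentiment must be 'positive' or 'negative'")
    return {
        "url": url,
        "company": company,
        "score": score,
        "sentiment": sentiment,
        "crawled_at": crawled_at.isoformat(),
    }


# Made-up example values, not real data.
record = build_record(
    url="https://example.com/article",
    company="Example Corp",
    score=0.82,
    sentiment="positive",
    crawled_at=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
payload = json.dumps(record, sort_keys=True)
print(payload)
```

A flat record like this can be loaded directly into the consumer's own datasets with any standard JSON parser.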

Benefits for Client

  • 100% API availability and continuous data feeds
  • Dynamic list of keywords and sources
  • Zero data processing efforts at client’s end
  • Scalable infrastructure reduced client’s costs
  • Artificial Intelligence and Deep Learning analysis
  • Client’s workload focused only on querying internal datasets and running analysis

850000+
Scraped Items / Day
80+
Happy Clients
1900000+
Crawled Pages / Day

Get an Immediate Quotation for Your Specific Requirements

What Data would you like today?

Please enter the websites that you want to scrape. We will email you a quote and get started on payment.
– Yes, it is really that simple.

Simple Quote Process

Just a few quick steps, and you will have described your need. We will answer you in less than 6 hours, and can usually start the work in less than 48 hours.

Let us know your Needs

Contact Us