📊 Data Acquisition and Management

The foundation of Cryptify AI lies in the accurate and comprehensive collection of data across multiple relevant sources. Our primary objective is to measure the impact of Key Opinion Leader (KOL) and influencer posts on the price movements of cryptocurrencies and tokens across various time horizons. This is achieved by collecting data from three key sources:

  1. Social Media Data: Initially, we focus on Twitter, Reddit, and Telegram, given their prominence in the crypto community. We track and analyze engagement metrics such as likes, comments, and views at both the post and user levels. Additionally, we employ Natural Language Processing (NLP) techniques to generate embeddings and conduct sentiment analysis on the textual content of posts.

  2. Cryptocurrency Market Data: Using APIs such as the CoinMarketCap API, we gather data on cryptocurrency and token price movements. This includes well-known tokens as well as less familiar ones, ensuring broad coverage of the market.

  3. External Data Sources: To refine our analysis and mitigate the effects of confounding variables, we incorporate additional external data. This helps us isolate the impact of influencer posts by accounting for external factors that could also influence token prices.
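
To illustrate how the three sources might be combined, the sketch below measures a token's return over several time horizons after a post and subtracts the return of a broad market benchmark over the same window. This is only one simple way to account for a market-wide confounder and is an assumption made for illustration, not a description of Cryptify AI's actual model. All function names, the data layout (sorted lists of `(timestamp, price)` tuples), and the choice of benchmark are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def price_at(series, ts):
    """Return the last known price at or before ts, or None if there is none.

    `series` is a list of (timestamp, price) tuples sorted by timestamp.
    """
    times = [t for t, _ in series]
    i = bisect_right(times, ts)
    return series[i - 1][1] if i else None


def horizon_return(series, post_time, horizon):
    """Simple return from post_time to post_time + horizon.

    Returns None when the window is not fully covered by the price series.
    """
    if not series or post_time + horizon > series[-1][0]:
        return None
    start = price_at(series, post_time)
    end = price_at(series, post_time + horizon)
    if start is None or end is None or start == 0:
        return None
    return end / start - 1.0


def adjusted_returns(token, benchmark, post_time, horizons):
    """Token return minus benchmark return for each horizon (hypothetical adjustment)."""
    out = {}
    for h in horizons:
        r_token = horizon_return(token, post_time, h)
        r_market = horizon_return(benchmark, post_time, h)
        out[h] = None if r_token is None or r_market is None else r_token - r_market
    return out


# Made-up prices purely for demonstration.
t0 = datetime(2024, 1, 1, 12, 0)
token_prices = [(t0, 100.0), (t0 + timedelta(hours=1), 110.0), (t0 + timedelta(hours=24), 121.0)]
market_prices = [(t0, 50.0), (t0 + timedelta(hours=1), 51.0), (t0 + timedelta(hours=24), 55.0)]
print(adjusted_returns(token_prices, market_prices, t0, [timedelta(hours=1), timedelta(hours=24)]))
```

In practice, the post timestamp would come from the social media data, the token series from the market data API, and the benchmark (or other controls) from the external sources. Engagement and sentiment features could then be related to these adjusted returns.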

All data is ingested using Azure cloud services. We use Azure Functions to run the Python scripts that load data from APIs, Azure Data Factory to manage our Extract, Load, and Transform (ELT) processes, and Azure Blob Storage and SQL databases to store the data. This infrastructure lets us scale data collection efficiently as we expand the platform to include more social media channels and tokens.
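
As a rough sketch of the "load first, transform later" step, the example below wraps a raw API payload in a small metadata envelope and writes it to a date-partitioned path. The `fetch` and `store` callables, the path layout, and the envelope fields are all assumptions made for illustration. In an Azure deployment, `store` might wrap a call such as `BlobClient.upload_blob` from the `azure-storage-blob` package and `ingest` might run inside a timer-triggered Azure Function, but the source does not specify these details.

```python
import json
from datetime import datetime, timezone


def blob_path(source, fetched_at):
    """Build a date-partitioned path such as raw/<source>/YYYY/MM/DD/HHMMSS.json."""
    ts = fetched_at.astimezone(timezone.utc)
    return f"raw/{source}/{ts:%Y/%m/%d}/{ts:%H%M%S}.json"


def envelope(source, payload, fetched_at):
    """Wrap the untouched payload with ingestion metadata (ELT: no transform yet)."""
    return {
        "source": source,
        "fetched_at": fetched_at.astimezone(timezone.utc).isoformat(),
        "payload": payload,
    }


def ingest(fetch, store, source, now=None):
    """Fetch a raw payload, wrap it, and hand (path, bytes) to the store callable.

    `fetch` and `store` are hypothetical injected callables, which keeps the
    sketch independent of any particular API client or storage SDK.
    """
    now = now or datetime.now(timezone.utc)
    body = json.dumps(envelope(source, fetch(), now)).encode("utf-8")
    path = blob_path(source, now)
    store(path, body)
    return path


# Demonstration with an in-memory "store" and a made-up payload.
fake_storage = {}
written = ingest(
    fetch=lambda: [{"symbol": "BTC", "price": 42000.5}],
    store=fake_storage.__setitem__,
    source="market-api",
    now=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
print(written)
```

Keeping the stored payload unmodified leaves all cleaning and reshaping to the downstream transform stage, which is the usual motivation for ELT over ETL.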
