airscholar / RealtimeStreamingEngineering

This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP sockets, Apache Spark, an OpenAI LLM, Kafka, and Elasticsearch. It covers each stage: data acquisition, processing, sentiment analysis with ChatGPT, producing to a Kafka topic, and connecting to Elasticsearch.
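The first stages of such a pipeline can be sketched with the Python standard library alone. This is a minimal illustration, not the repository's code: the record fields (`review_id`, `text`, `feedback`), the function names, and the keyword rule in `classify_sentiment` are all hypothetical. The keyword rule stands in for the LLM call, and the Spark, Kafka, and Elasticsearch stages are omitted.

```python
import json
import socket
import threading


def serve_reviews(reviews, host="127.0.0.1", port=0):
    """Start a one-shot TCP server that sends each review as one JSON line."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))  # port=0 lets the OS pick a free port
    server.listen(1)
    bound_port = server.getsockname()[1]

    def handle():
        conn, _ = server.accept()
        with conn:
            for review in reviews:
                conn.sendall((json.dumps(review) + "\n").encode("utf-8"))
        server.close()

    thread = threading.Thread(target=handle, daemon=True)
    thread.start()
    return bound_port, thread


def classify_sentiment(text):
    """Hypothetical stand-in for the LLM call: a naive keyword rule."""
    lowered = text.lower()
    if any(word in lowered for word in ("great", "good", "love")):
        return "POSITIVE"
    if any(word in lowered for word in ("bad", "awful", "terrible")):
        return "NEGATIVE"
    return "NEUTRAL"


def consume(host, port):
    """Read JSON lines from the socket and yield sentiment-enriched records."""
    with socket.create_connection((host, port)) as client:
        with client.makefile("r", encoding="utf-8") as stream:
            for line in stream:
                record = json.loads(line)
                record["feedback"] = classify_sentiment(record["text"])
                yield record


def run_demo():
    """Stream three made-up reviews through the socket and enrich them."""
    reviews = [
        {"review_id": 1, "text": "Great food and good service"},
        {"review_id": 2, "text": "Terrible wait, awful coffee"},
        {"review_id": 3, "text": "It was a restaurant"},
    ]
    port, thread = serve_reviews(reviews)
    enriched = list(consume("127.0.0.1", port))
    thread.join()
    return enriched


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

In a fuller pipeline, the enriched records would be handed to a Kafka producer and indexed into Elasticsearch downstream instead of being collected in a list.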

Date Created 2023-10-28 (about a year ago)
Commits 2 (last one about a year ago)
Stargazers 29 (0 this week)
Watchers 3 (0 this week)
Forks 20
License unknown
Ranking

RepositoryStats indexes 579,555 repositories. Of these, airscholar/RealtimeStreamingEngineering is ranked #560,031 (3rd percentile) for total stargazers and #420,420 for total watchers. GitHub reports the primary language for this repository as Python; among repositories using this language, it is ranked #110,526/115,123.

airscholar/RealtimeStreamingEngineering is also tagged with popular topics. For these it is ranked: chatgpt (#2,305/2,496), kafka (#814/837), elasticsearch (#728/742), openai-api (#435/479), apache-spark (#105/108)
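The page does not say how the percentile is derived. One reading that reproduces the quoted "3rd percentile" for stargazers is the share of indexed repositories ranked below this one, rounded down. A minimal sketch under that assumption (the function name `rank_percentile` is hypothetical):

```python
def rank_percentile(rank, total):
    """Percent of ranked items that sit below the given rank, rounded down.

    Assumption: rank 1 is the best position and `total` is the number of
    items ranked. This is not confirmed as the site's own formula.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (total - rank) * 100 // total


if __name__ == "__main__":
    # Stargazer rank quoted above: #560,031 of 579,555 repositories.
    print(rank_percentile(560_031, 579_555))
```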

Other Information

airscholar/RealtimeStreamingEngineering has GitHub issues enabled. There is 1 open issue and 0 closed issues.

Homepage URL: https://www.youtube.com/watch?v=ETdyFfYZaqU

Star History

GitHub stargazers over time

Watcher History

GitHub watchers over time, collection started in '23

Recent Commit History

2 commits on the default branch (main) since Jan '22

Yearly Commits

Commits to the default branch (main) per year

Languages

The only known language in this repository is Python

updated: 2024-10-29 @ 04:52pm, id: 711152714 / R_kgDOKmNUSg