vim89 / datapipelines-essentials-python

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, along with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.

Date Created 2019-11-16 (5 years ago)
Commits 15 (last one about a year ago)
Stargazers 54 (0 this week)
Watchers 4 (0 this week)
Forks 38
License apache-2.0
Ranking

RepositoryStats indexes 640,560 repositories. Of these, vim89/datapipelines-essentials-python is ranked #470,117 (27th percentile) for total stargazers and #368,371 for total watchers. GitHub reports the primary language for this repository as Python; among repositories using this language, it is ranked #92,106/131,281.
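
The percentile quoted above can be reproduced from the rank and the index size. The sketch below is an assumption about how such a figure could be derived, not RepositoryStats' documented method: it takes the share of indexed repositories ranked below the given position and rounds to a whole percent. The function name `percentile_from_rank` is hypothetical.

```python
def percentile_from_rank(rank: int, total: int) -> int:
    """Percent of indexed repositories ranked below `rank` (1 = best).

    Assumed formula: 100 * (total - rank) / total, rounded to a whole number.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round(100 * (total - rank) / total)


# Stargazer rank and index size taken from the paragraph above.
print(percentile_from_rank(470_117, 640_560))  # prints 27
```

With the stargazer figures from this page, the formula gives 27, which matches the quoted percentile.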

vim89/datapipelines-essentials-python is also tagged with popular topics; for these it is ranked: python (#18,797/23,610), python3 (#3,339/4,344), xml (#499/590), spark (#471/559), big-data (#327/372), etl (#234/288), hadoop (#161/188), pyspark (#86/118), apache-spark (#93/115)

Other Information

vim89/datapipelines-essentials-python has GitHub issues enabled; there is 1 open issue and 0 closed issues.

Star History

GitHub stargazers over time

[Chart: stargazers (axis 0 to 60) by year, 2021 to 2025]

Watcher History

GitHub watchers over time; collection started in '23

[Chart: watchers (axis 4 to 6), May '23 to Apr '25]

Recent Commit History

1 commit on the default branch (master) since Jan '22

[Chart: commits (axis 0 to 1), Jun '23 to Apr '25]

Yearly Commits

Commits to the default branch (master) per year

[Chart: commits per year (axis 0 to 10) for 2019, 2020, 2021, 2022, and 2024]

Issue History

[Chart: total, open, and closed issues over time (axis 0 to 1), 2022 to 2025]

Languages

The primary language is Python, but the repository also contains Shell and HTML.

updated: 2025-04-14 @ 01:41am, id: 222109927 / R_kgDODT0g5w