
Common Crawl Foundation is hiring an Engineer

About Common Crawl Foundation

The Common Crawl Foundation maintains a 17-year-old, 8-petabyte crawl and archive of the web. Its open dataset has been cited in nearly 10,000 research papers and is the most-used dataset in the AWS Open Data program. The organization is also very active in the open source community.

Job Description

We are expanding our engineering team and looking for people who are excited about our non-profit, open data mission. Candidates should be proficient with Python, ideally with some Java as well, and proficient with cloud systems such as Spark/PySpark. Our current team is composed of engineers who do some data science and data scientists who do some engineering. We are focused on improving our crawl, making new data products, and using these new data products to improve our crawl.

Remote

Salary

Not Specified

Benefits

Not Specified

Tech Tags

Apache Spark, Cloud Systems, Java, PySpark, Python

Part Time

Date Listed

01 November, 2024 (over 1 year ago)


This job is archived, but you can still apply.

