Blue Ridge
Smarter software for demand planning, replenishment, and inventory optimization, with positive ROI in months, not years.
WE'RE HIRING! EMEA (Georgia, Romania, and the Czech Republic only)
Senior Data Engineer (Remote, PySpark, AWS)
About the Company
Ready to take your career to the next level? Blue Ridge is a leading SaaS company that’s been transforming supply chain planning for over a decade. We help businesses predict demand, manage inventory, and optimize operations using powerful, data-driven insights. From retailers to distributors, our platform delivers AI-powered recommendations, runs advanced machine learning algorithms, and integrates seamlessly with ERP systems.
Our customers consistently rank us as a top performer on G2. In the Winter 2025 Report, Blue Ridge was recognized as the #1 Supply Chain Planning Software Provider in North America, and a leader in Demand Planning and Sales & Operations Planning.

Why Blue Ridge?
Ranked #1 for Best Relationship
Our unique customer success model—LifeLine—sets us apart. It’s more than support; it’s a partnership that helps our clients achieve world-class supply chain performance.
Recognized Leader in Four Categories:
  • Demand Planning
  • Supply Chain Planning
  • Enterprise Supply Chain Planning
  • Supply Chain Planning – Europe
Momentum Leader in Two Categories:
  • Demand Planning
  • Supply Chain Planning

Our Mission
To ensure companies always have the right products, in the right place, at the right time—without relying on guesswork. With 220+ clients worldwide (primarily in the U.S. and Norway) and 20%+ annual growth, we’re scaling rapidly. As part of our continued expansion, we’re growing our R&D team in the EMEA region and looking for a Senior Data Engineer to help shape the future of our platform.

Our Tech Transformation
We are modernizing our data and application layers to enhance scalability, performance, and flexibility:
  • Apache Spark for high-performance compute processing
  • Data lake architecture for scalable storage & advanced analytics
  • Micro front-end architecture for a modular, flexible UI
  • Drag-and-drop dashboards for fully customizable user experiences

Historically, the company's software was built on traditional technologies: relational databases, stored procedures, .NET, and C# (now transitioning to Python and Node.js). As our customer base expanded and data volumes surged, scalability and performance bottlenecks became a challenge.
To address these issues, we explored modern solutions like Amazon EMR and Databricks, ultimately committing to a full-scale migration. This transformation is projected to boost performance by 16-20x, enabling us to process significantly larger data volumes at much higher speeds.
Beyond performance improvements, this transformation unlocks new market opportunities. Industries such as grocery and perishable goods retail require real-time data processing, as demand fluctuates by the hour rather than by the day or week. Our previous system couldn't efficiently support such dynamic requirements. With our new big data infrastructure, we can now process real-time data, positioning us as a competitive player in fast-moving consumer goods (FMCG) markets.
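
To make this concrete, here is a minimal sketch (not our actual pipeline) of the kind of hourly demand aggregation the new infrastructure enables, using PySpark Structured Streaming; the Kafka topic, event schema, and S3 paths are illustrative assumptions.

```python
# Minimal sketch: hourly demand aggregation over a stream of point-of-sale events.
# The topic name, schema, and paths below are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, IntegerType, TimestampType

spark = SparkSession.builder.appName("hourly-demand-demo").getOrCreate()

event_schema = StructType([
    StructField("sku", StringType()),
    StructField("store_id", StringType()),
    StructField("qty", IntegerType()),
    StructField("sold_at", TimestampType()),
])

# Read raw sales events from a hypothetical Kafka topic.
sales = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "pos-sales")
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Tumbling one-hour windows per SKU and store, tolerating events up to 15 minutes late.
hourly_demand = (
    sales.withWatermark("sold_at", "15 minutes")
    .groupBy(F.window("sold_at", "1 hour"), "sku", "store_id")
    .agg(F.sum("qty").alias("units_sold"))
)

# Land the aggregates in the data lake for downstream planning jobs.
query = (
    hourly_demand.writeStream.outputMode("append")
    .format("parquet")
    .option("path", "s3://example-lake/silver/hourly_demand/")  # illustrative path
    .option("checkpointLocation", "s3://example-lake/_checkpoints/hourly_demand/")
    .start()
)
```
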
This role has been created to help lead that transformation and drive innovation in our data platform.




Future Vision
Cloud infrastructure is another key focus area. While we currently operate primarily on AWS, we plan to adopt a multi-cloud strategy, enabling operations across platforms like Azure and Google Cloud, depending on which offers better performance or pricing at any given time. Eventually, we aim to dynamically shift workloads between cloud providers in real time, optimizing costs and performance.

Primary Goal of the Position
Optimizing and scaling big data processing using PySpark to support real-time analytics and large-scale data workloads
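
To illustrate the day-to-day work this implies, the snippet below shows a few of the standard Spark tuning levers involved (adaptive query execution, shuffle parallelism, broadcast joins). The values are placeholders; real settings depend on cluster size and data volume.

```python
# Illustrative defaults only; real values depend on the cluster and the workload.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("tuning-demo")
    # Let Spark coalesce small shuffle partitions and split skewed ones at runtime.
    .config("spark.sql.adaptive.enabled", "true")
    .config("spark.sql.adaptive.skewJoin.enabled", "true")
    # Starting point for shuffle parallelism; tune against actual stage metrics.
    .config("spark.sql.shuffle.partitions", "400")
    # Broadcast small dimension tables instead of shuffling the large fact table.
    .config("spark.sql.autoBroadcastJoinThreshold", str(64 * 1024 * 1024))
    .getOrCreate()
)
```
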


About the Role
As we transition to a modern big data infrastructure, PySpark plays a critical role in powering high-performance data processing. We are seeking a Senior Data Engineer with deep PySpark expertise to optimize data pipelines, enhance processing efficiency, and drive cost-effective cloud operations. This role will have a direct impact on scalability, performance, and real-time data processing, ensuring the company remains competitive in data-driven markets.

You’ll be working closely with a Data Platform Architect and a newly formed team of four Data Engineers based in India (GMT+5:30) and one Data Engineer in Uzbekistan (GMT+5). Additionally, we're planning to hire two more Senior Data Engineers in Georgia later this year.
In this role, you’ll report to the CTO, who is based in the GMT-8 time zone, and the VP of Engineering (EDT/EST).

Position Details
  • Role: Senior Data Engineer
  • Location: Remote (We’re looking for candidates based in Georgia, Romania, and the Czech Republic only)
  • Employment: Service Agreement (B2B contract; you’ll need a legal entity to sign)
  • Start Date: ASAP
  • Salary: $5,500 - $8,000 USD per month GROSS (fixed income, paid via SWIFT)
  • Working Hours: 11 AM to 7 PM local time. No night or weekend work is expected
  • Time Overlaps: Sync-ups with R&D (Pune, India) in GMT+5:30 and developers in GMT+5, plus occasional meetings with the VP of Engineering (EST/EDT) and the CTO (GMT-8).
  • Equipment: The company will provide a laptop.

What You’ll Be Doing

  • Optimize Data Processing Pipelines: Fine-tune PySpark jobs for maximum performance, scalability, and cost efficiency, enabling smooth real-time and batch data processing.
  • Modernize Legacy Systems: Drive the migration from traditional .NET, C#, and relational database systems to a modern big data tech stack.
  • Build Scalable ETL Pipelines: Design and maintain robust ETL/ELT workflows capable of handling large volumes of data within our Bronze/Silver/Gold data lake architecture (a minimal sketch follows this list).
  • Enhance Apache Spark Workloads: Apply best practices such as memory tuning, efficient partitioning, and caching to optimize Spark jobs.
  • Leverage Cloud Platforms: Use AWS EMR, Databricks, and other cloud services to support scalable, low-maintenance, high-performance analytics environments.
  • Balance Cost & Performance: Continuously monitor resource usage, optimize Spark cluster configurations, and manage cloud spend without compromising availability.
  • Support Real-Time Data Streaming: Contribute to event-driven architectures by developing and maintaining real-time streaming data pipelines.
  • Collaborate Across Teams: Partner closely with data scientists, ML engineers, integration specialists, and developers to prepare and optimize data assets.
  • Enforce Best Practices: Implement strong data governance, security, and compliance policies to ensure data integrity and protection.
  • Drive Innovation: Participate in global initiatives to advance supply chain technology and real-time decision-making capabilities.
  • Mentor Junior Engineers: Share your knowledge of PySpark, distributed systems, and scalable architectures to help develop the team’s capabilities.
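
For a flavor of the ETL and Spark-optimization items above, here is a minimal Bronze-to-Silver step in PySpark. Table layouts, paths, and column names are assumptions made for illustration, not our production schema.

```python
# Minimal Bronze -> Silver sketch; table names, paths, and columns are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bronze-to-silver-demo").getOrCreate()

# Bronze: raw order lines as ingested (hypothetical layout).
orders_bronze = spark.read.parquet("s3://example-lake/bronze/order_lines/")

# Silver: deduplicated, typed, and filtered records ready for analytics.
orders_silver = (
    orders_bronze
    .dropDuplicates(["order_id", "line_no"])
    .withColumn("order_date", F.to_date("ordered_at"))
    .filter(F.col("qty") > 0)
)

# Cache only because the frame is reused for both the write and the count below.
orders_silver.cache()

# Partition by date so downstream demand-planning jobs can prune whole days.
(
    orders_silver.write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-lake/silver/order_lines/")
)

# Simple row count as a stand-in for fuller data-quality checks.
print(f"silver rows written: {orders_silver.count()}")

orders_silver.unpersist()
```
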

What We’re Looking For

Experience & Expertise:
  • 5+ years as a Data Engineer, with solid experience in big data ecosystems.
  • 7+ years of hands-on AWS experience is a must, including deep familiarity with EMR, IAM, VPC, EKS, ALB, and Lambda.
  • Cloud experience beyond AWS (GCP or Azure) is a strong plus.
  • Proficiency with Python (including data structures and algorithms), SQL, and data modeling.
  • Strong expertise in distributed computing frameworks, particularly Apache Spark and Airflow.
  • Experience with streaming technologies such as Kafka.
  • Proven track record optimizing Spark jobs for scalability, reliability, and performance.
  • Familiarity with cloud-native ETL/ELT workflows, data sharing techniques, and query optimization (e.g., AWS Athena, Glue, Databricks).
  • Experience with complex business logic implementation and enabling application engineers through APIs and abstractions.
  • Solid understanding of data modeling, warehousing, and schema design.
Soft Skills:
  • Strong problem-solving skills and proactive communication.
  • Fluent English, B2 or higher (both written and verbal).
Preferred Skills & Certifications:
  • Familiarity with .NET applications structure and deployment.
  • Relevant cloud certifications (AWS Solutions Architect, Developer, Big Data Specialty).
  • Certifications or proven experience in Databricks, Apache Spark, Apache Airflow, and data modeling are a plus.

Recruitment Process
# 1 Initial Interview: Up to 1 hour with HR and/or a self-assessment form. If you prefer, you can skip the call and discuss all questions and details in writing instead. Just let us know!
# 2 Managerial Interview (Optional): 30-60 minutes with the CTO to learn more about the company, the position, and future plans directly from the source.
# 3 Test Assignment: Up to 113 minutes on the iMocha platform (graph data structures, array and string manipulation, all in Python, plus a few multiple-choice questions on Spark).
# 4 Technical Interview: Up to 1 hour with a Platform/Application Architect.
# 5 Offer & Paperwork: Up to 30 minutes with the CTO to finalize conditions and complete necessary paperwork.
# 6 Onboarding: Get ready to join the team and start your journey!

Got Questions or Interested? Let’s Connect!
If this role sounds like a great fit for you, or if you have any questions, let’s schedule a 30-minute exploratory call this week.
We’d love to chat about how your skills align with this exciting opportunity!