Our client needs Java developers to strengthen their Site Reliability Engineering (SRE) team, working on a roster of initiatives: database usage optimization, streaming, automatic degradation, migrating data models to more effective formats, introducing asynchronous decoupling, sharding data models, shifting to eventual consistency patterns, etc.
This calls for developers with strong experience in optimizing large distributed systems and in iterative re-architecting that stays hidden from users.
We are specifically looking for people who have:
- Strong development experience in Java
- Strong experience in optimizing database systems, specifically PostgreSQL (the databases hold billions of rows)
- Strong experience in investigating cascading failures in distributed systems
- Strong communication skills
- Experience with distributed datastores including: Elasticsearch, Riak, Cassandra, Redis, S3
- Experience with large migration efforts that run over an extended period without being visible to users or causing downtime
- Experience designing and running iterative, staged rollouts of architectural changes
- Experience with top-down troubleshooting (from high-level metrics to tracing and aggregating application/host/network traffic logs)
- Experience with AWS / cloud, microservices, Docker, Kubernetes or similar
- Experience with error-tracking tools, or ideas on how to approach error tracing
Duration: 6 months, with possible extension
Min. 5 years of professional IT experience.