Salary Distributions

I would like to start with the salary distribution in Poland, because I think it is important to know how much you can earn in IT. I will show you the distribution of salaries in Poland and in the world. Importantly, the salaries look quite inflated because of the wide range of requirements per position.

Starting with juniors, we can see that the average salary on a contract of employment (UoP) is higher than on B2B. I don't know exactly why, but I suppose there were not many UoP offers for juniors, so their average came out higher. The situation normalizes for mids and seniors, where the salary on B2B is higher than on UoP. Funnily enough, the lowest salary across all levels was for a mid on B2B, at around 3100 PLN.
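To make the comparison above concrete, here is a minimal sketch of how such averages and minimums could be computed per seniority level and contract type. The offers and the `summarize` helper are made up for illustration; they are not my dataset or my actual code. It also shows why a group with very few offers can end up with a surprisingly high average.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical offers: (seniority, contract type, monthly salary in PLN).
# These numbers are invented for illustration only.
offers = [
    ("junior", "UoP", 7000),
    ("junior", "B2B", 5000),
    ("junior", "B2B", 6000),
    ("mid", "UoP", 12000),
    ("mid", "B2B", 3100),
    ("mid", "B2B", 18000),
    ("senior", "UoP", 18000),
    ("senior", "B2B", 24000),
]


def summarize(offers):
    """Return {(seniority, contract): {"count", "mean", "min"}}."""
    groups = defaultdict(list)
    for seniority, contract, salary in offers:
        groups[(seniority, contract)].append(salary)
    return {
        key: {"count": len(vals), "mean": mean(vals), "min": min(vals)}
        for key, vals in groups.items()
    }


summary = summarize(offers)
for (seniority, contract), stats in sorted(summary.items()):
    print(seniority, contract, stats)
```

In this toy data the single junior UoP offer gives that group a higher mean than junior B2B, which is the kind of small-sample effect I suspect in the real numbers.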

How do you work in IT and who are employers looking for?

Juniors

Here we can see a very important piece of information: there are few offers for juniors. This can complicate their situation, and it also confirms that getting a job can currently be a challenge.

Mids/Seniors

Mids, also known as regulars, are in a pretty good situation right now. They will most likely find a job, because around 40% of all offers are for them, similar to seniors.

Work modes

This is really interesting: most offers are for remote work, but this follows from the fact that most offers are for fairly advanced programmers.

Most popular technologies and programmer-friendly cities in Poland

Cities

The most popular is, of course, the capital of Poland, Warsaw, followed by Krakow and Wroclaw. There have not been many changes over the years. These cities are the most popular among software engineers.

Techs

The most popular technology in the offers was SQL, or more generally relational databases. Next came Docker and Kubernetes, and in third place was Python. The second place follows from the fact that most offers were for seniors, who usually need to know these kinds of tools.
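A ranking like the one above can be produced by counting in how many offers each technology appears. The sketch below assumes each offer carries a list of required technologies; the offers and the `tech_counts` helper are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Hypothetical offers, each with its required technologies (invented data).
offers = [
    {"seniority": "senior", "techs": ["SQL", "Docker", "Kubernetes"]},
    {"seniority": "senior", "techs": ["Python", "SQL", "Docker"]},
    {"seniority": "mid", "techs": ["SQL", "Linux"]},
    {"seniority": "junior", "techs": ["Python"]},
]


def tech_counts(offers):
    """Count in how many offers each technology appears (once per offer)."""
    counts = Counter()
    for offer in offers:
        # set() guards against a technology listed twice in one offer
        counts.update(set(offer["techs"]))
    return counts


counts = tech_counts(offers)
for tech, n in counts.most_common():
    print(tech, n)
```

Filtering `offers` by seniority before counting would show which technologies are driven mostly by senior offers.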

Model results - accuracy

Model metrics

After training, GradientBoostingRegressor came out as the best model. The other one has very similar errors and R2, and its training was much faster than the winner's. Errors such as RMSE and MAE are quite large, but this comes from the small number of offers with very different variables. Moreover, some of the technologies, like Linux and C, appeared mainly in offers for mids.
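For readers less familiar with these metrics, here is a small plain-Python sketch of their standard definitions. It is not the code I used to evaluate the models, and the salaries in it are made up; the `regression_metrics` helper is a hypothetical name.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R2 computed from their standard definitions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)                   # root mean squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot                       # share of variance explained
    return {"rmse": rmse, "mae": mae, "r2": r2}


# Invented test and predicted salaries in PLN.
y_test = [8000, 12000, 20000, 25000]
y_pred = [9000, 11000, 22000, 24000]

metrics = regression_metrics(y_test, y_pred)
print(metrics)
```

RMSE penalizes large misses more heavily than MAE, so a few offers with unusual salaries can make RMSE noticeably bigger than MAE.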

As we can see, the chart simply shows the dependency between test and predicted values. In this case x is my test value and y is the predicted one. The more the results are concentrated around the red line, the better the model determines the predicted values. What's more, the model predicts slightly inflated salaries, but I think this is because my dataset had really varied salaries for juniors, or because I didn't match my technologies well enough.
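The "slightly inflated" impression from the chart can also be checked numerically. The sketch below assumes the red line is the identity line y = x, so points above it are over-predictions. The helper names and the numbers are hypothetical and not taken from my results.

```python
def mean_bias(y_true, y_pred):
    """Mean of (predicted - actual); positive means over-prediction on average."""
    return sum(p - t for t, p in zip(y_true, y_pred)) / len(y_true)


def share_overpredicted(y_true, y_pred):
    """Fraction of points lying above the y = x line."""
    above = sum(1 for t, p in zip(y_true, y_pred) if p > t)
    return above / len(y_true)


# Invented test and predicted salaries in PLN.
y_test = [6000, 9000, 15000, 22000]
y_pred = [7500, 9500, 14500, 23000]

bias = mean_bias(y_test, y_pred)
share = share_overpredicted(y_test, y_pred)
print(bias, share)
```

Computing the same two numbers separately for juniors, mids and seniors would show whether the inflation really comes from the junior offers.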