WHUG #34: „Apache Spark vs rest of the world” and „Elephants in the cloud”

28 May 2018, 18:00 – 20:00
Stefana Banacha 2
02 Warsaw

MIMUW Faculty (room 5440)
Banacha 2 · Warsaw

We are happy to invite you to the 34th meetup of WHUG. We will be pleased to host Arkadiusz Jachnik from Agora S.A. and Krzysztof Adamski from ING S.A./GetInData. Details of both talks are below.

1) Apache Spark vs rest of the world – Problems and Solutions

Apache Spark is a great solution for building Big Data applications. It provides really fast SQL-like processing, a machine learning library, and a streaming module for near-real-time processing of data streams. Unfortunately, during application development and production deployments we often run into difficulties when mixing various data sources or bulk loading computed data into SQL or NoSQL databases. All in all, there are a lot of challenges at the confluence of Apache Spark and the rest of the Big Data world, including HBase, Hive, PostgreSQL and Kafka. Those are the issues that I will discuss in this presentation.


Arkadiusz Jachnik
Mr. Arkadiusz Jachnik is a Senior Data Scientist in the Big Data Department of Agora S.A., one of the biggest media companies in Poland. He is currently working on a real-time user profiling system and a highly scalable recommendation platform. He received his BSc and MSc in Computer Science from the Poznan University of Technology. Mr. Jachnik is an author and coauthor of several machine learning publications. His current research activity concerns multi-label classification and multi-output prediction.

2) Elephants in the cloud or how to become cloud ready

The way you operate your Big Data environment is not going to be the same anymore. This session is based on our experience managing on-premise environments and on lessons learned from innovative data-driven companies that successfully migrated their multi-petabyte Hadoop clusters. We will cover where to start and what decisions you have to make to gradually become cloud ready. The examples refer to Google Cloud Platform, yet the challenges are common.


Krzysztof Adamski
Krzysztof started working with Hadoop in 2012 at a high-frequency trading company in Amsterdam, then joined ING, where he focused on securing big data solutions to meet banking standards. In 2017, at GetInData, he supported Hadoop at Spotify, taking part in one of the biggest transitions to Google Cloud.
