Big Data Montreal #31 – Tuesday December 9th 6:30pm at the RPM Startup Centre

Tickets: bdm31.eventbrite.ca

Big Data Montreal would like to invite you to its thirty-first meeting!

Join us on Tuesday, December 9th at 6:30pm for an evening of talks and a chance to network with other Big Data enthusiasts from Montreal!

The meeting will take place at the RPM Startup Centre, located at 420 Guy Street.

All are welcome, whether you already have some experience with Big Data technologies or are simply curious to learn more.

We have two scheduled presentations:

  • Spying on Hadoop with strace by Julia Evans, Machine Learning Software Engineer at Stripe
    Do you feel like you totally understand all the internals of the Hadoop ecosystem? How HDFS works? (I sure don’t!) Learning a little more about internals can help you use existing tools better, make appropriate architecture choices, and write better-performing jobs. To understand what’s going on, we’ll spy on exactly what information gets transmitted over the network with strace, and talk a little about how we can use that understanding to write smarter map/reduce jobs. You’ll come away understanding HDFS better and with some fun things to try out.
  • Introduction to Spark and MLlib by Reza Zadeh, consulting professor at Stanford and Technical Advisor to Databricks
    As computer clusters scale up, data flow models such as MapReduce have emerged as a way to run fault-tolerant computations on commodity hardware. Unfortunately, MapReduce is limited in efficiency for many numerical algorithms. We show how new data flow engines, such as Apache Spark, enable much faster iterative and numerical computations, while keeping the scalability and fault-tolerance properties of MapReduce. In this tutorial, we will begin with an overview of data flow computing models and the commodity cluster environment in comparison with traditional HPC and message-passing environments. We will then introduce Spark and show how common numerical and machine learning algorithms have been implemented on it.

There will also be a few flash presentations.

N.B.: This edition will be in English.

Finally, you are also welcome to join us at the nearby Brasseurs de Montreal, after the presentations, for some casual networking (please use the appropriate ticket so we know how many people to expect).

Please tell your friends and colleagues :) !

P.S.: We are (as always) looking for speakers for future editions of BDM, so if you’re interested in presenting (or if you know people who are), please don’t hesitate to write to us at bdm-admin@googlegroups.com :) !


