Data Platform Architecture, Evolution, and Philosophy at Netflix – @Scale 2014

Hello everybody, okay, let's get started. I'm Kurt Brown, I manage the data platform team at Netflix, and today we're going to talk about the architecture of the data platform, a bit about our philosophy, and then we're going to do a deep dive into Mantis, which is a reactive stream processing system that we're building at Netflix; Justin Becker is going to take over and talk about that.

First I want to start with what's different at Netflix versus most places, since some of you might be using some of this stuff. I saw in the keynote everyone sort of bashing the cloud, and that's one of the big things that's different about Netflix, so let's dive into those. The biggest thing is that virtually all of our infrastructure is in the AWS cloud. We were in the data center way back when, but almost nothing is there now, and by the end of the year we expect nothing to be left in the data center, including our billing systems, our analytics systems, and so on. That's a big differentiator versus a lot of really at-scale companies, and we're an at-scale company as well; it's not as big a differentiator versus some smaller startups. Another big difference versus a lot of big analytics companies is that we use S3 as our central data hub. A lot of people either have their own remote object store or file store, or they use HDFS very heavily. We actually don't use HDFS all that heavily; it's part of the plumbing of some of our jobs, but it's not our persistent storage layer for Hadoop. We also use EMR for Hadoop. We've looked at all the distributions: Cloudera, MapR, Hortonworks, rolling our own. But at the end of the day, since we're in Amazon and we have a good relationship with the EMR (Elastic MapReduce) team, partnering with them has worked really well.

Tying into the fabric of Amazon and having that good relationship gives us a lot. One thing we negotiated is a site license with them, which is now publicly available, and that gives us some great capabilities: we don't have to think about cost, which we would have had to think about if we went with a vendor trying to charge us by node, and then there's the elasticity of Amazon; we've already pre-reserved a lot of hardware, so we just use it and don't have to worry about costs at this point. The other thing, which gets into the philosophy section, is freedom and responsibility. I'll dive pretty deep into this, but it's a little different culture at Netflix: what we try to do is enable someone to do whatever they want to do, to a certain degree. We want you to act in Netflix's best interest, and then you're responsible for your actions. It expects a level of maturity on the part of developers, and the benefit you get out of it is that you don't have so many process rules to put up with along the way.

So first, diving into architecture, I want to show you way back, circa 2009, what our BI stack looked like. It looks a bit like a banking stack, if you're in BI, and that's because we had a couple of bankers come in and start our BI group. Needless to say they're not at Netflix anymore, and our stack looks very, very different at this point. So what does it look like today? The first thing is that S3, as I said, is our central data hub. That's where all the data is going in and out of; it's the plumbing and the fabric that glues everything together. The first piece is our massive event pipeline, sort of like the Scribe or Flume that a lot of you might be using. It's called Suro, it's open source, and it processes about 350 billion events per day: things like video quality, what people are searching for, UI interactions, and the like. It's a pretty nifty little pipeline. You basically just throw some key-value pairs in there, you can annotate the object, and if a Hive table doesn't exist it will automatically create it for you. It'll put all those key-value pairs in a map field in Hive, or you can configure it to blow them out into first-class columns. So you don't have to think a whole lot about it: you just log it and it's available, again like a lot of the technologies you're using now.
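To make that concrete, here is a minimal sketch of what logging such an event could look like. The client interface and method names below are hypothetical, for illustration only, and are not Suro's actual API; the point is simply that an event is an annotated bag of key-value pairs and the pipeline takes care of landing it in Hive.

```java
import java.util.HashMap;
import java.util.Map;

public class EventLoggingSketch {

    // Hypothetical stand-in for an event-pipeline client; not Suro's real interface.
    interface EventClient {
        void send(String eventType, Map<String, Object> payload);
    }

    static void logPlaybackStart(EventClient client) {
        Map<String, Object> event = new HashMap<>();
        event.put("device", "Roku");          // arbitrary key-value pairs chosen by the producer
        event.put("country", "CA");
        event.put("title_id", 70178217L);
        event.put("bitrate_kbps", 3000);

        // Downstream, if a Hive table for this event type doesn't exist it gets created
        // automatically; the pairs land in a map column by default, or can be configured
        // to be blown out into first-class columns.
        client.send("playback_start", event);
    }
}
```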
That covers the event part of the pipeline, but what about the dimension part? You saw Oracle in the previous picture; there's very little of that left in our infrastructure at this point, and Cassandra has worked really well for Netflix. It scales very well, it spans multiple data centers, very fast reads, very fast writes, so it's great for everything except for what my group does, where it was a real pain. What we need is to get all of that data into S3, because we want to join it with all this rich event data and do our analytics. When we initially talked to the Cassandra folks they said, oh, just do the analytics in Cassandra. That's all well and good, but we have 350 billion events, a lot of the time, that we need to join together with this stuff, so we need to get the data out. A couple of engineers on my team, at the end of the day, said we don't really care about Cassandra for getting the data, we just want the data. So we have an open-source tool called Priam which automatically backs up the SSTables underlying Cassandra and puts them on S3. We load those up in a Hadoop cluster, take the three copies of the data in Cassandra, crunch them together, take the most recent timestamp, and re-persist it in JSON format in S3, and now our ETL developers have something they can work with; they don't have to worry about any of the idiosyncrasies of Cassandra. And what do they do with it then? They use Pig and Python, which are our ETL frameworks of choice. They can transform any of the data they want, they can filter it, they can aggregate it, they can join it to Suro data, and then they can re-persist that data back into S3 so that analytics can happen quickly.
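A minimal sketch of the reconciliation step described above: once the backed-up SSTables are loaded into Hadoop, each cell may appear once per Cassandra replica, and the copy with the most recent write timestamp wins before the row is re-persisted as JSON. The types here are illustrative, not the actual Netflix code.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class ReplicaMergeSketch {

    // One copy of a cell as read from a replica's backed-up SSTable.
    record Cell(String value, long writeTimestampMicros) {}

    // Keep the most recently written copy; that's the value that goes into the JSON output.
    static Optional<Cell> latest(List<Cell> replicaCopies) {
        return replicaCopies.stream()
                .max(Comparator.comparingLong(Cell::writeTimestampMicros));
    }

    public static void main(String[] args) {
        List<Cell> copies = List.of(
                new Cell("plan=standard", 1_409_000_000_000_000L),
                new Cell("plan=premium",  1_409_500_000_000_000L),   // newest write wins
                new Cell("plan=standard", 1_409_000_000_000_000L));
        System.out.println(latest(copies).orElseThrow());
    }
}
```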

Hive is a big part of the mix; it actually predates Pig in our infrastructure. This is where most of our engineers do their querying, and it's slow, as you well know, so that's not all well and good, and that led to Presto. We released Presto into our environment about six months ago and it has been awesome for us. We work pretty closely with Facebook on it, contributing back and partnering on things like ODBC drivers and the Parquet format, and getting it working with S3 was a big thing for us, obviously, since that's not something Facebook had to deal with but something we did. Now, despite these great cloud technologies, we do have some traditional data-warehouse-type databases in our mix. Teradata: you can see the little cloud around it. We had it in our data center, we wanted to be out of the data center, and we pioneered with Teradata a cloud offering on their part; we were the first. It's not full-scale cloud for what we're using, but they're getting better and better at the cloud. Which might lead to: what about Redshift? If we're in Amazon, why aren't we using Redshift? We actually are using Redshift quite a bit, so that's another big analytic database in the mix. We're using it for a third-party CS application where we don't want to co-mingle our back ends and somehow give them a way to get into Teradata and see some data. Even though we don't put PII data in there, we just don't want our third parties to be able to get at our really rich sources of data. Also, our engineers and our algorithms teams will use Redshift to do scratchpad-type work, like joining data together, and we're hoping that over time the race between Redshift and Teradata continues, they get better and better, and we keep the sort of hybrid environment that you see.

So where does it go from here? It feeds into MicroStrategy; that's still part of our mix, it's our enterprise reporting tool, so most of our daily dashboards and email reports come out of MicroStrategy. More recently Tableau has gained a foothold. A lot of people like it: it's very easy to get started and you don't have the heaviness of MicroStrategy, but you also lack things like the rich metadata layer, so there are a lot of trade-offs along the way. Recently, partnering with Facebook, we also got the ODBC driver working between Presto and Tableau, so we're using that internally now. And lastly we have one other reporting tool called Sting, so you can see we have a whole suite right here. This was in response to a lot of teams within Netflix starting to build their own reporting layers on top of our big data warehouse, on top of Hive in many cases, and it just felt very inefficient. So my team said, okay, let's build something with a service-oriented architecture, where you can throw your own UI on top if you want. What it lets you do is take the results of a Hive query, load it in memory, and slice and dice it and filter it.

One theme you can see from all these technologies is that, from Netflix's perspective, hybrid is really the way to go. A lot of companies say, I'm just going to use Hive for everything; your users are going to kill you if you do that, from a performance perspective. Or a lot of times people say, I just want to use one reporting tool, which can make sense if you can get away with it; less is definitely better. But if there's something really differentiating that you need to drive your business, then you need a hybrid environment. So you saw we have MicroStrategy, we have Tableau, we have Sting. They all serve different purposes; there's an expense to maintaining them, but they help drive our business forward. On the data-store side even more so: we have Presto, we have Hive, we have Teradata, we have Redshift, but again we've very much rationalized why we're using what we're using. Presto, for example, was a no-brainer for us: it shares the same metadata as Hive, it shares the same data as Hive for us, especially since S3 is our back end, so it's a very low-cost system for us, with performance in the 10x to 50x range.

As I mentioned, S3 is our central data hub, and data is going in and out all the time. But why would we do that? Some of the great things about S3 are durability, so you get eleven nines of durability, which is a lot better than you can get in your data center for the most part, and it's very scaled out as well, so you can actually get performance benefits in some cases using S3. We also have the ability to undo changes: we use versioned buckets within S3, so if someone deletes something by accident we can undo it, and that happens every so often.
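As a sketch of that safety net, enabling versioning on a bucket is a one-call operation with the AWS SDK for Java (v1 shown here; the bucket name is made up). Once versioning is on, a delete just adds a delete marker, so the "undo" is removing that marker.

```java
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.BucketVersioningConfiguration;
import com.amazonaws.services.s3.model.SetBucketVersioningConfigurationRequest;

public class VersionedBucketSketch {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        // Turn on versioning so every overwrite or delete keeps the previous object version.
        s3.setBucketVersioningConfiguration(new SetBucketVersioningConfigurationRequest(
                "my-warehouse-bucket",
                new BucketVersioningConfiguration(BucketVersioningConfiguration.ENABLED)));

        // To undo an accidental delete later, list the key's versions and remove the
        // delete marker, e.g. s3.deleteVersion(bucket, key, deleteMarkerVersionId).
    }
}
```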
But to really get the advantage out of S3, you need to abstract people away from it and take advantage of this rich, elastic ecosystem in Amazon and EMR, so we created something called Genie. What this is is a set of APIs that you interact with, so you don't worry about what all the resources are behind the scenes: which Hadoop cluster is running, is it still there, is it going away, am I expanding it, am I shrinking it, am I doing a red-black push? Where Genie comes in is that this is what you interact with as a developer: you say, I'm going to submit my job through Genie, and it takes care of things behind the scenes. And it's hopefully relevant to you because it is open sourced. We actually just launched Genie 2 internally, which is a more generalized execution framework, so instead of being just for Hadoop it can execute against any back-end framework. Who knows what it will be used for in the future, maybe with Spark; we'll figure it out as we go.

So what does it do for you? Here's a classic example, and swap in HDFS for S3 if that's your environment: perhaps you have Hadoop running, everything's well and good, and then badness strikes. Your Hadoop cluster stops performing, something's really wrong, and oh crap, your data is in HDFS, so you can't just get rid of your cluster; there's a lot of scrambling. For us, this is the scrambling we do: we just spin up another cluster and then we kill the old one. If the old cluster is on its last legs but hasn't completely died, we'll keep it running until the long-running jobs finish, and when those are done we spin the cluster down. In the meantime all new jobs go to the new cluster. The same paradigm applies to, as I mentioned before, red-black pushes: whenever we're doing an upgrade, spin up a new cluster, let the old jobs finish on the old cluster, and then you go on your merry way with the new cluster. Taking this further, we don't have to do it just for red-black pushes or when there are problems; we can also do it for our daily operations. Our nightly batch processing, our real production-heavy jobs, run on a high-SLA cluster, and the cowboy, ad hoc, whatever-you-want-to-do work runs on our query cluster. The nice thing is that they're all sharing the same data in S3.
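As a sketch of what "submit my job through Genie" looks like from a client's point of view: Genie is a REST service, so submission is an HTTP POST describing the job plus tags that let Genie pick a matching cluster. The endpoint path and payload fields below are assumptions for illustration; the real job-submission schema is in the Genie documentation.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class GenieSubmitSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical payload: the job names a Pig script on S3 and asks, via tags,
        // for "a cluster that handles SLA work" rather than one specific cluster.
        String job = "{"
                + "\"name\": \"nightly-agg\","
                + "\"commandArgs\": \"-f s3://example-scripts/nightly_agg.pig\","
                + "\"clusterCriterias\": [{\"tags\": [\"sched:sla\"]}]"
                + "}";

        HttpResponse<String> resp = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder(URI.create("http://genie.example.com/api/v2/jobs"))
                        .header("Content-Type", "application/json")
                        .POST(HttpRequest.BodyPublishers.ofString(job))
                        .build(),
                HttpResponse.BodyHandlers.ofString());

        // Genie resolves which cluster actually runs the job, so a red-black push or a
        // bonus-cluster give-back only changes routing, not the submitting code.
        System.out.println(resp.statusCode() + " " + resp.body());
    }
}
```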

And we can have as many clusters as we want. We don't have to worry about contention on HDFS, and while in theory the priority schedulers could handle more efficient use, they're just not that good in many cases, so this gives us de facto isolation between our SLA jobs and our query jobs.

Another nice benefit of being in Amazon is something we call bonus clusters. I mentioned before that we have an EMR site license, so there's no additional expense there, and we've pre-reserved a ton of hardware because we're streaming videos all the time. But people go to sleep at a certain point, thank goodness, and we can use that hardware. At night, when a lot of our batch jobs need hardware, we've got a lot of machines sitting around at Netflix that we've already paid for; most people in Amazon at a certain scale actually reserve 24/7, so it's not as elastic as people think it can be, because it's expensive to be. So we just borrow that hardware, spin up three bonus clusters in different availability zones, essentially different data centers, and churn through our batch jobs. When we have to give the hardware back to production, we give those machines back and they continue using them. And the nice thing about Genie is that if we have to give those machines back before we're done, we can just repoint: even if someone pointed their job at a bonus cluster, we repoint it at the high-SLA cluster and any new jobs will run there. That's much harder to do if you've got jobs hard-coded to push to one particular cluster.

Now it's not all peaches and roses, unfortunately. S3 has eventual consistency to deal with; it's just part of the CAP theorem, something to deal with, and we struggled with it for a while. The worst part about it is that it doesn't happen very often, but it creates a crisis of confidence in your users: hey, I know I wrote the data there, and the next job tried to read it and it wasn't there, so now I just don't trust anything anymore. After some clever engineering work, some folks on my team came up with something called S3mper, which uses aspect-oriented programming to wrap the calls to the file system. Whenever something is written to S3 or read from S3, we also write to a consistent index store. We're currently using DynamoDB, but we could use anything. Then, in addition to checking whether the data is in S3, it also checks against the index in DynamoDB and asks: is it there? If it is in DynamoDB and it's not in S3, then you know there's some eventual consistency going on: either S3 hasn't caught up, or maybe the data has been deleted but hasn't fully disappeared from the index. What this lets us do is have a fully configurable system where, if we see these inconsistencies, we can just wait (by default we wait for fifteen minutes), we can fail the job after fifteen minutes, or we can say go on your merry way. The developer has total control, for every job, over how they want to deal with eventual consistency, and at a certain point we just say, okay, trust S3; we don't use the consistent index forever, since it should catch up at some point. It's been a really nice shim layer that lets us trust our data more, and it's open source as well, if you're using Amazon.
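Here is a conceptual sketch of the check S3mper layers on top of a listing; the class and method names are hypothetical (the real project wraps the Hadoop file-system calls with aspects). Writes record the expected files in a consistent index such as DynamoDB, and reads compare the S3 listing against that index, then wait, fail, or proceed according to the job's configuration.

```java
import java.util.Set;
import java.util.concurrent.TimeUnit;

public class ConsistentListingSketch {

    interface ConsistentIndex {                       // e.g. backed by DynamoDB
        Set<String> expectedFiles(String path);
    }

    interface EventualFileSystem {                    // the eventually consistent S3 listing
        Set<String> list(String path);
    }

    static Set<String> listWithCheck(EventualFileSystem s3, ConsistentIndex index, String path,
                                     long timeoutMinutes, boolean failOnTimeout)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MINUTES.toNanos(timeoutMinutes);
        while (true) {
            Set<String> listed = s3.list(path);
            if (listed.containsAll(index.expectedFiles(path))) {
                return listed;                        // listing has caught up with what was written
            }
            if (System.nanoTime() > deadline) {       // default in the talk: wait up to 15 minutes
                if (failOnTimeout) {
                    throw new IllegalStateException("S3 listing still inconsistent for " + path);
                }
                return listed;                        // or proceed and trust S3 from here on
            }
            TimeUnit.SECONDS.sleep(30);               // give eventual consistency time to catch up
        }
    }
}
```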
So that's what we're doing now, but what's in the works? Faster and easier: those are always things people know and love, so here are some cool things we have in the works right now. Hadoop 2: we recently upgraded, as I imagine some of you have as well. Again, not all warm and fuzzies there, but we got it working, dealt with a bunch of bugs, worked with Amazon, and put some patches back into open source. In and of itself it doesn't give you huge performance improvements, but it gives you flexibility, like better control of memory and the like for MapReduce jobs, and it's a foundation you can build on top of. Some of the other projects: Parquet, since we use Pig and Hive so heavily. We're not using ORC files right now, but Parquet is a columnar format with great compression and a rich ecosystem around it, and we've done a fair bit of work especially getting it working with S3 and with Presto. I mentioned Presto before; we've been really happy with it. Whenever I hear someone say they're just going to use Hive for all their data warehousing, I think, oh my gosh, your users are going to revolt, and they should. But if you throw in something like Presto or Vertica or Teradata or something else to ease the pain for the quick queries or for the reporting tools, then it can go really far. We haven't moved fully to Presto and we are still using traditional databases, but it'll be interesting to see how the ecosystem develops over time. Pig on Tez: we did quite a bit of contribution to Pig on Tez; the only problem we hit is that we kind of got ahead of Tez itself (or "taz", depending on your pronunciation), because a lot of the back-end stats are just not available yet. We got to the point where our developers were getting frustrated because they didn't have the visibility they had with Pig on MapReduce in Hadoop 1, so we've put it on pause for now to let Tez catch up, and we might resume finishing the work on Pig on Tez in the future. And then Spark: we looked at it in 2013 and we could tip it over by blowing on it at the time, so it didn't work at any sort of scale or with multi-tenancy, but obviously the community has exploded since then and Databricks broke out into its own company, so we're quite interested to see where it goes and we're investigating putting it back in our stack. Various teams at Netflix are looking at it for stream processing, for machine learning, and for batch analytics.

On the easier front, we have a lot of open-source tools out right now. This is a new one called Inviso, which will be open sourced in the next couple of days or weeks. What it does is use Elasticsearch to capture all of the data about your Hadoop ecosystem and make it very easy to search it, visualize it, and see performance issues, just how things are going, without having to trek through all these logs all over the place. That's particularly challenging somewhere like Netflix, where we have machines that disappear after certain periods of time, so we need to capture that rich metadata somewhere so we can use it whenever we need it. Here's a search for a particular job: you see one job, when it started and finished, the cluster it ran on, the Genie name, and a Genie ID you can link to, and from that you can get really rich data about what's happening in your MapReduce jobs. At the top you see a sort of swim lane which shows the parallelization of your job, so all in one view this is the layout of a particular Pig job and how it plays out over time, and below it you can see each task and how it's operating, along with rich counters, all consolidated in one place. You can see things like a big skew on a particular task, or a flaky machine, or the like, so it's basically just bringing it all together. Look for a tech blog post coming out soon, and then you're welcome to try it; it should be really easy to install in your environment. Another benefit is that it shows cluster utilization. I highlighted one particular user: you can see how much of the cluster they are using, and you can see at the bottom our backlog over time on our Hadoop infrastructure. This helps us when someone says, oh, I can't get resources; we can very easily see whether it's because of when they were running it, and maybe they can run that backfill job at a different time and use our resources more efficiently.

Next on the easier front, we are putting out something called a Big Data API, otherwise known as Kragle. Kragle is from the Lego Movie, if you've seen it, where it's the glue that sticks everything together, which is kind of antithetical to an API, but cohesion is the theme here. We have tons of tools and they were all custom written in their own way, some of them open source: Franklin is our metadata store, Aegisthus is how we get data out of Cassandra, Charlotte does dependency analysis, Transporter moves data, Lipstick is visualization and monitoring for our Pig workflows, and so on. What the Big Data API gives us is a common syntax, in Python, that anyone can use. We're going to build all of our tools on top of it, and very easily you can, for example, run your Hive jobs as specified here, get the details of your job, and see the results, all programmatically. Within my team, the data platform team, we can build on top of it, or anyone within the data science or algorithms teams at Netflix can use it. It's still early stages; here's another example where it's interacting with Franklin and then using Transporter to move the data from Franklin into Hive tables, so you can look at the slides online afterwards and really go see that code. And then the culmination of this is going to be the Big Data Portal. The first version is very, very simple: it's just a query interface on top of Hive and Presto, so we added it to the mix, but the goal here is not just to be a query interface.
This is where you go if you're in data science and engineering and you want to know what's happening in your cluster: just go here, you'll be able to query your data, you'll be able to see rich information about the cluster, anything top-level, I-just-want-to-know-what's-happening goes through this Big Data Portal, and then we'll link off to our individual tools, written on top of the Big Data API, if you want to do deeper dives. So this is a pretty exciting project, and we're really excited to see where it goes within Netflix.

I have a few minutes left, so I'll burn through philosophy very fast: how do we do what we do within Netflix? I'm sure some of you do a lot of these things as well; if not, there might be some neat things you can take back to your companies. One of my favorites is the Netflix expense policy. It's five words: act in Netflix's best interests. Why do I care about an expense policy in a presentation like this? Really, it's a North Star within Netflix: do what makes sense for Netflix. We don't need to say you can spend $400 on a hotel, or you can't fly first class or business class; just do what makes sense. If you're going on a long business trip and you really need to work and you really need to fly business class to Europe, that's freedom and responsibility, that's your judgment call at the end of the day. But mostly it's: would you do that if it were on your own dime? If you would, all power to you, and at least this way we don't have to add a lot of rules and process.

Another is our development and deployment flow. Instead of having lots of nasty gates along the way, where you have to go give the code to a DBA and they have to release it, our flow is: best practices up front, and make sure everyone's clear on those; common components, consultation, tools, and automation to provide the rails your release process runs on, so people are doing things consistently; and then the big price we pay is on the cleanup. We let people get code out really quickly, and then in many cases we have to go and clean it up, because things build up technical debt and cruft over time.

I mentioned DBAs. I have one DBA on my team, and I'm in charge of the data platform, and he doesn't do mostly DBA stuff; it's more performance-type work, being a database expert. The reason is that he doesn't release code; there's no point. I hate it in companies where you give the DBA some code, they release it to production, and they add no value in most cases. There are exceptions in some places, but if it's just releasing code to production, versus optimizing it, what value does it add? Instead we have a tool called Being John Malkovich (there are a lot of movie themes at Netflix) where you can flip yourself from a read-only to a writable mode in the database world. Normally I want to be in read-only mode, because I don't want to accidentally truncate lots of tables, but I can flip myself into writable mode, do whatever I need to, and then put myself back into read-only mode later; if I forget, it automatically reverts me after, I think, five hours.

A QA team: this varies within Netflix. If it's production facing, user facing, there's going to be a little more QA; within the data platform we can be a lot more flexible, so we don't have a QA team and we don't have a process where you have to get everything reviewed. Freedom and responsibility: if it's a complex change, find one of your peers and have them review it; if it's simple, just release it. Upgrading is in a similar vein. If we're upgrading from one version of Hive to another, we'll do a baseline amount of testing, but we won't test it to the hilt for months and months and still not catch everything anyway. At a certain point we say, let's release it: we've done some baseline testing, we gave people an environment where they can test their own stuff, and then we'll do some firefighting on the day of release, and we expect that to happen. Part of that is accepting that things are going to break when you do it this way. You can't say we're not going to fully test it and then beat someone up when it breaks; you say, I know a few things are going to break. You still expect good code from people, but you expect a few things to break and then you recover quickly, as opposed to holding up the launch for a couple of extra months and still not catching 90% of those bugs.

Safety nets are really important. I mentioned versioned buckets for S3; that's one thing we do. We keep backups of our databases, so there are a lot of ways we can recover quickly; we can use that red-black push to go back to a cluster running an older version. Another neat thing is that we have vending machines at Netflix where you can get any hardware you want. They actually have prices on them, but it doesn't cost you anything; the price is just context to say this is what it costs, and if you think it's worth it, you should just push the button and get your power cord or whatever you need. It's the same thing with MicroStrategy licenses: if you log into MicroStrategy, it automatically creates one for you; you don't have to go through lots of processes and tickets, because if you need it, you need it. Rules and processes, as I said, we eliminate whenever we can. On my first day at Netflix people were beating me up, like, welcome, welcome, how about that release process we had, which used Perforce and Unix scripts and all this stuff; and now, as I told you, we have the Being John Malkovich tool that just lets you release day-to-day stuff. Tell versus ask: if you're in a big company and you ask, hey, can I do this, it'll never happen.
So instead we just say we're going to do this, and you either say don't do it, if it's really, really important, or in many cases we'll just do it anyway and revert it if something breaks along the way. I mentioned before that those reports are getting loaded into memory all the time; we just send an email out every couple of months saying, hey, we're going to expire this report unless you click on this link, and if you click, it stays around; otherwise it goes away. "Does it really matter?" is probably another good thing to think about in your day to day. At my last company I remember eight o'clock was this witching hour for getting our batch processing done. Nothing changed whether it got done at eight o'clock or ten o'clock, yet everyone was on call and scrambling and getting yelled at, and it just made no sense. At the end of the day: do you want to put all your energy into putting your developers on call and your source systems on call just to prevent anything from breaking, or to fix it fast when no one's going to do anything different anyway, or do you want to let them sleep at night and focus that energy on something important during the day? The last thing I'll cover is the Netflix culture deck. Do a Google search on it; it's not fluff. It takes some of the themes I mentioned before, and I think it's quite an interesting read on how we do things at Netflix, and again there might be some good principles in there for you. So with that I'll turn it over to Justin, who's going to deep dive into Mantis.

Okay, so I'm Justin Becker.

I'm an engineer at Netflix, and I'm excited to be here to talk about a new system under development at Netflix called Mantis. To start the conversation I want to provide some context about what motivated us to build Mantis, and I like to think we were inspired by traditional television. One of the things I think we take for granted is that when you get home at the end of the day, you turn on your television, you press play, and your TV just works. At Netflix, for internet television, we want our customers to have the same experience: when they get home later, on their Apple TV, their Roku, their Android device, they select a movie, they press play, and we want the experience to just work. So the question we had to answer was: does Netflix work? When people get home and press play, how is their experience? This actually turns out to be a non-trivial problem, because when you answer the question "does Netflix work," you really need to know: does Netflix work for everybody? It's not okay that it works for the majority or for the popular devices; we really want to know that it works for everyone, just like your traditional television.

To help provide some context, let's start with a simple question, a question that Netflix should be able to answer: when you sit down and press play for House of Cards, does it work? Or, to get more specific: when you sit down in Canada and press play for House of Cards, does it work? Or even more specific: if you're in Canada, watching on a Roku, and you press play, does it work? You can continue this logic: on a particular ISP, does it work; in a particular city, does it work. To generalize, you'd like to be able to answer this question over many dimensions: bitrate, UI version, firmware version, A/B test cell. Really, at the end of the day, we're interested in answering these sorts of questions for billions of permutations, and by the way, we're going to do it automatically and in real time.

When we thought about what we would need to build in order to answer those types of questions, we realized pretty quickly that we needed to scale our insights capability, and the name of the system we put together to do that is Mantis. We like the mantis shrimp because there are two interesting facts about it: it has extremely good vision, and it's able to strike its prey really, really quickly. We liked those two properties. Now, you might be asking yourself: why are we building yet another stream processing system? There's Storm, there's Spark, there's Samza, and these are all really good systems, and we evaluated them. But when we thought about the problem we were trying to solve, Mantis turned out to be solving a unique problem for a particular domain, and that domain is the insights domain. What I mean by insights is: when I think of data and deriving value from it, insights means the data is giving you visibility into complex systems. I like to contrast that with the business domain, where I think the value you derive from the data is making customer-facing business decisions, for example when you launch a new product and run an A/B test, or when you're trying to identify customer trends and growth rates. When you think about insights problems, there's a set of requirements that are a little bit different from traditional business problems.
Insight systems are more cost sensitive. To give you an example: you don't want your insight system to cost more than the system it is monitoring. Another key difference is that you want to prioritize throughput. This may sound a little strange when you compare it to big production systems, but at least at Netflix, every customer request we get actually creates many insight events or measurements from that customer event, so typically our insight systems need to scale by a factor larger than our customer-facing systems. Another key requirement is that we really want to optimize for low latency. When I was giving you the example about whether House of Cards works, it's not enough to know that it worked an hour ago or last week or last month; we want to know that it works now, and we want to be able to answer that question for billions of different permutations. Another key difference is that we're okay with tolerating some amount of data loss, as long as we can get a good enough sample; that's all we need to answer these types of questions. So I want to focus now on how we built Mantis, and because this is the @Scale conference, I want to emphasize some of these decisions with an emphasis on scale.

I want to focus on three particular areas: utilization, latency, and throughput. Utilization is all about maximizing the utility of the machines that are running your software, and for Mantis we began by designing it for the cloud. In the cloud we get two nice benefits: the capability to elastically grow and shrink the infrastructure that Mantis runs on, and the ability to elastically grow and shrink the jobs running within that infrastructure. Okay, so how did we achieve low latency? The way we think about low latency in Mantis is that everything is asynchronous from top to bottom, starting with our network stack: we use Netty at the network layer, and at the programming layer we use Rx (Reactive Extensions), and that gives us an async architecture from top to bottom. The next challenge I want to talk about is throughput. Throughput is all about pushing data through pipes, really maximizing the amount of information flowing through your infrastructure, and to maximize throughput we needed an integrated approach to back pressure. What back pressure is: imagine you have a complex network of pipes and one of those pipes has a clog. You don't want to keep pushing more and more water through that pipe, because you risk bursting it, so instead you throttle or limit the amount of water going into that pipe until you can unclog it. For Mantis we took a similar, integrated approach to back pressure, and really that's about having a cooperative strategy. That means you start at the network: you look at it and ask, can the network actually push information to the subsequent stages in my job or execution flow? If it can't keep up, you need to throttle upstream. Another approach is processing back pressure: within Mantis we can detect that a computation is taking too long and take steps, before that computation, to slow down or limit the amount of information flowing through those pipes. And because this is a holistic approach, we can also inform our scheduler: when we detect that these back pressure issues are occurring, we can say, hey scheduler, you need to provide more resources to this job so that it can scale up to accommodate the amount of information flowing through it.
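Since Mantis is built on Rx, those strategies map naturally onto reactive operators. The sketch below uses plain RxJava 1.x to show three of them on a fast stream: drop on overload, absorb bursts with a small bounded buffer, and a cooperative, credit-style subscriber that only requests as much as it can process. This is an illustration of the ideas, not the Mantis API, and the event sources here are trivial stand-ins.

```java
import java.util.concurrent.TimeUnit;
import rx.Observable;
import rx.Subscriber;
import rx.schedulers.Schedulers;

public class BackpressureSketch {

    public static void main(String[] args) throws Exception {
        // A fast producer that cannot itself be slowed down (stand-in for a hot event stream).
        Observable<Long> firehose = Observable.interval(1, TimeUnit.MILLISECONDS);

        // 1. Drop on overload: keep latency low, accept data loss (fine for sampled insights).
        firehose.onBackpressureDrop()
                .observeOn(Schedulers.computation())
                .subscribe(BackpressureSketch::process);

        // 2. Small bounded buffer: absorb bursts without falling arbitrarily far behind.
        firehose.onBackpressureBuffer(10_000)
                .observeOn(Schedulers.computation())
                .subscribe(BackpressureSketch::process);

        // 3. Cooperative credit: with a source that honors requests, the consumer grants
        //    "permits" so upstream never outruns it.
        Observable.range(0, 1_000_000).subscribe(new Subscriber<Integer>() {
            @Override public void onStart() { request(100); }            // initial credit
            @Override public void onNext(Integer e) { process(e); request(1); } // replenish as we keep up
            @Override public void onError(Throwable t) { t.printStackTrace(); }
            @Override public void onCompleted() { }
        });

        TimeUnit.SECONDS.sleep(2);                                        // let the demo run briefly
    }

    static void process(long event) { /* the stage's computation would go here */ }
}
```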
Now that I've spoken a little bit about Mantis, I want to give you a feel for what a Mantis job looks like. A Mantis job is composed of three abstractions: a source, one or many stages, and a sink. The source is all about getting data into your job, and it also provides some basic back pressure strategies: if I'm pulling data into my job and my job can't process that amount of information, what strategy should I take, rather than just continuing to push information through that computation? The stage is the unit of processing; this is where the user of Mantis writes their custom processing logic. It's also our unit of scheduling, so when we're scheduling a job to run in our infrastructure, the stage dictates how that work gets spread over many machines. The sink is the output of the job. This also has back pressure concerns, because if you imagine many clients connecting to your sink, you need to make sure that one slow client doesn't affect the entire job.

I wanted to provide an example of what a job looks like. I know this is code and it's hard to read, but I'll try to walk through it. The blue part is just the Mantis API itself. For this particular example we're sourcing data from Twitter: we're getting the firehose and filtering it down to any tweets that mention Netflix. For each tweet we run NLP sentiment analysis to see whether people are speaking positively about Netflix or not, then we take that sentiment outcome and group it, and in the next stage all we do is buffer the sentiment by some amount of time and count it, and then we output that information to a Graphite graph. So this is a quick example of how you can take the Twitter firehose, run a sentiment analysis algorithm on it, and graph the different sentiment outcomes over time. We do things like this to answer the questions I presented earlier, where I want to track all of the movie failure rates for the entire catalog at Netflix; this is what a job would look like in that infrastructure.
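The actual example on the slide uses the Mantis API, which isn't reproduced here; as a rough stand-in, the pipeline's logic can be sketched with plain RxJava 1.x. The firehose source and the sentiment classifier below are trivial stubs introduced for illustration; in the real job the source stage would read from Twitter, the classifier would be the NLP model, and the sink would push the counts to Graphite.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import rx.Observable;

public class NetflixSentimentSketch {

    enum Sentiment { POSITIVE, NEGATIVE, NEUTRAL }

    public static void main(String[] args) throws Exception {
        twitterFirehose()
                .filter(t -> t.toLowerCase().contains("netflix"))   // stage 1: keep Netflix tweets
                .map(NetflixSentimentSketch::classify)              // sentiment per tweet
                .buffer(10, TimeUnit.SECONDS)                       // stage 2: window by time
                .map(NetflixSentimentSketch::countBySentiment)      // count each outcome in the window
                .subscribe(counts -> System.out.println(counts));   // sink: e.g. emit to a Graphite graph

        TimeUnit.SECONDS.sleep(30);
    }

    // Stand-in for the Twitter firehose: a slow stream of fake tweets.
    static Observable<String> twitterFirehose() {
        return Observable.interval(200, TimeUnit.MILLISECONDS)
                .map(i -> (i % 3 == 0) ? "loving Netflix tonight" : "netflix buffering again :(");
    }

    // Stand-in for the NLP sentiment model.
    static Sentiment classify(String tweet) {
        if (tweet.contains("loving")) return Sentiment.POSITIVE;
        if (tweet.contains(":(")) return Sentiment.NEGATIVE;
        return Sentiment.NEUTRAL;
    }

    static Map<Sentiment, Long> countBySentiment(List<Sentiment> window) {
        Map<Sentiment, Long> counts = new EnumMap<>(Sentiment.class);
        for (Sentiment s : window) counts.merge(s, 1L, Long::sum);
        return counts;
    }
}
```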

Okay, so that's a job, but what is Mantis itself? Mantis is a fault-tolerant master with fault-tolerant agents, and an intelligent scheduler that manages the work done by those agents. The work those agents perform is solving operational insight problems, using a job definition like the one I just showed. And that's really the end of my presentation. If you have any questions, feel free to contact me or Kurt. Thank you for your time. Any questions?

Hello, hello. Before we do Q&A: we had a lot of trouble with the iPad app last time, so I'm going to give an actual demo so we get it right this time. It's really simple: there's a red button, which means you didn't like the talk, a yellow, which means you moderately liked it, and a green, which means you really liked it. Just tap once on one of those buttons, and that's all that's needed to give feedback. Thank you.

You said you wanted to analyze different dimensions when a failure occurs, like ISP or country, and you said there are millions of possible combinations. How do you actually generate those combinations, or how do you come up with the combinations to monitor? We don't do it ahead of time; it's really a byproduct of the job that's written. If you're going to track those permutations in your job, you need to group by and partition by the dimensions you're most interested in tracking. When I think about Mantis jobs, I think of them as managing billions of streams of information and computing analytic results for those billions of little small streams. So it's a byproduct of the job, and we don't store any of the information; the information just flows through the system and we compute those metrics for people. So does it solve the problem of figuring out which dimensions to monitor? No, it doesn't automatically do that for you. We have a layer built on top of Mantis that helps with anomaly detection and outlier detection and things like that, but at its core it's really a processing framework for real-time analysis. Thank you.

You mentioned you use S3 as the backbone for storing most of the data. How do you deal with sensitive data, for example privacy data? How do you make sure it's secure and keep the trust of the user? For the most part we just don't store it, because it doesn't have enough analytic value, or at least the trade-off in our world isn't worth it. Cassandra obviously has to have it, because we're serving user requests and validating things, but we keep it out of S3. It's just a trade-off we've made, and it lets us run faster and have fewer controls in place. What about behavior data, for example which movie a user watched: would you consider that sensitive or not? I guess from a VPPA perspective, the Video Privacy Protection Act, it potentially could be. There are lots of controls; there are security keys and all that, and we don't make it available to anyone outside of Netflix. But we don't have HIPAA data, for example, or credit card data stored there, or usernames or emails; we try to keep all of that out, because again we've made that cost-benefit trade-off. If someone had to do some really custom analysis, we would take it out of Cassandra, put it in some side system, and do the analytics there, and I think that lets us run faster. Thank you.

Kind of a follow-up to the previous question: does it mean anybody who wants to analyze data using Mantis has to know beforehand what dimensions they want to slice and dice the data on, or are there tools built on top that you can use, once the data has been processed, to slice and dice it in different ways?
Yes. The way Mantis works is you have a Mantis job, and a job can have parameters. So it's not so much that you know ahead of time exactly what you want to track, but you do need to know the general structure of the data you're analyzing, and then at runtime, when you submit the job, you can inject parameters that specify which of those things you want to track. It's definitely not an a priori, instrument-your-code-and-the-metrics-flow-into-a-data-store model; it's very much: I'm going to write this job right now, submit it into the infrastructure, and once I've submitted it the analytics flow starts to happen and I get the output. So at the time I submit the job, I do have to know what I'm tracking. An example would be: customer service calls and says, hey, we just did a release in France and we're getting calls that people can't start playback on this particular device. At that moment we would launch a job that tracks that device over many dimensions, giving us insight into what's going on. All right, thank you.
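To illustrate that answer (again with plain RxJava 1.x rather than the Mantis API, and with made-up event fields and helper sources): parameters supplied at submission time narrow the stream, and a group-by over the dimensions of interest gives one small stream per permutation, each computing its own failure rate.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;
import rx.Observable;

public class FailureRateByDimensionSketch {

    static class PlayEvent {
        final String country, device, uiVersion;
        final boolean failed;
        PlayEvent(String country, String device, String uiVersion, boolean failed) {
            this.country = country; this.device = device;
            this.uiVersion = uiVersion; this.failed = failed;
        }
    }

    public static void main(String[] args) throws Exception {
        // "Parameters" injected when the job is submitted, e.g. after a CS report from France.
        String country = System.getProperty("country", "FR");
        String device = System.getProperty("device", "Roku");

        playAttempts()
                .filter(e -> e.country.equals(country) && e.device.equals(device))
                .groupBy(e -> e.country + "|" + e.device + "|" + e.uiVersion)  // one stream per permutation
                .flatMap(group -> group
                        .buffer(10, TimeUnit.SECONDS)
                        .map(window -> group.getKey() + " failureRate=" + failureRate(window)))
                .subscribe(System.out::println);

        TimeUnit.SECONDS.sleep(30);
    }

    static double failureRate(List<PlayEvent> window) {
        if (window.isEmpty()) return 0.0;
        return window.stream().filter(e -> e.failed).count() / (double) window.size();
    }

    // Stand-in source of playback-start events.
    static Observable<PlayEvent> playAttempts() {
        return Observable.interval(50, TimeUnit.MILLISECONDS)
                .map(i -> new PlayEvent("FR", "Roku", (i % 2 == 0) ? "ui-5.1" : "ui-5.2", i % 7 == 0));
    }
}
```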

Thanks for the talk. Which throttling mechanisms have you found best reduce back pressure: decreasing the sample rate, or the sample size, or is it really use case by use case? It's really use case by use case. We tend to have queues, and we keep them really small so we don't fall behind; the queues are just there to manage bursty behavior. Some use cases just say drop the data; some say, I want a small buffer to accommodate bursty traffic; and some actually have a sort of cooperative algorithm where they do a push-pull: before you even send me any data, I send you a release, like, you can send me a hundred, they send a hundred, I acknowledge, sort of like a TCP slow-start curve. So there are a ton of different strategies depending on the use case. And is this all for insights in particular? For the most part, yes; we've explored other use cases, but it's mostly for insights. Thank you.

For the example you gave of Mantis jobs: each of those stages, could they run on different machines? They definitely will. When you write a stage, the stage has a vertical scaling configuration and a horizontal scaling configuration, and those are learned over time by our scheduler. The first time you run a job you might have to bootstrap it with some of that information, but over time, as it runs, our system adapts and learns about the behavior of the job, looking at those back pressure signals and figuring out how to scale it organically. Is there a strategy to branch: could one stage push to multiple stages, or would you have to go through a sink? We do have branching strategies: you can do multicast or unicast, you can do partitioning, you can do collapsing, so we have different strategies of communication between the stages of a job. Thank you.

Could you talk a little bit more about Franklin, the metadata store, for example how it interacts with the daily Hadoop jobs and how you do updates to that metadata? So it's not open source right now. It came out when HCatalog was really, really new; we needed Pig to be able to interact with Hive, and HCatalog just wasn't ready. We tried using it, but in most cases Franklin either proxies metadata or can pass it through and pull it into its own catalog, and then, as you saw through the Big Data API, you interact with it and ask things like, give me all the details on the partitions, for example. An example of where it's used is in our Pig pipeline: because we're on S3, and because of this eventual consistency thing, we use what we call a batch pattern. We want to make sure Pig can interact with Hive, and what Franklin does is never overwrite the same location: when you're about to write data, it writes a new partition location, keeps the old one around, and when the processing is done it flips them over. So it's a metadata abstraction that you can interact with.
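A minimal sketch of that batch pattern, with a hypothetical metastore client standing in for Franklin: output always goes to a brand-new location, and only after the write fully succeeds does the partition's location flip, so readers never see a half-written partition and S3's eventual consistency on the new files stays hidden behind the old location until the flip.

```java
import java.util.List;

public class BatchPatternSketch {

    // Hypothetical metastore client; Franklin itself is not open source.
    interface Metastore {
        String getPartitionLocation(String table, String partition);
        void setPartitionLocation(String table, String partition, String newLocation);
    }

    static void publishPartition(Metastore metastore, String table, String partition,
                                 List<String> records) {
        // 1. Write to a brand-new, run-specific location; never overwrite the current one in place.
        String newLocation = "s3://warehouse/" + table + "/" + partition
                + "/run=" + System.currentTimeMillis() + "/";
        writeOutput(newLocation, records);            // assumed helper doing the actual S3 writes

        // 2. Only once the write has fully succeeded, flip the partition pointer.
        String oldLocation = metastore.getPartitionLocation(table, partition);
        metastore.setPartitionLocation(table, partition, newLocation);

        // 3. The old location stays around for a while (easy rollback), then gets cleaned up.
        scheduleCleanup(oldLocation);
    }

    static void writeOutput(String location, List<String> records) { /* S3 writes go here */ }
    static void scheduleCleanup(String location) { /* deferred delete of the old run */ }
}
```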
This one is for Kurt: in one of the slides you mentioned that you're using DynamoDB to solve the eventual consistency problem, but as far as I understand, DynamoDB also has eventual consistency, so could you explain more? Can anyone agree or disagree with that? My understanding is that it is a consistent index; the nice thing is that it's actually pluggable, so you could change it to any back end you want, but my understanding is that it is consistent. To add to that, it's configurable: when you're making a read or a write in DynamoDB, you can say you want this read to be more consistent than the default behavior, and I think for these use cases we set that consistency level very high, so you have good guarantees about the data being there when you query it. So we use strongly consistent reads and writes there, and I know we did that with Cassandra, configurably, as well.
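For reference, here is what opting into a strongly consistent read looks like with the AWS SDK for Java v1 (the table and key names are made up for the example): GetItem is eventually consistent by default, and setting the consistent-read flag is what gives the index its stronger guarantee.

```java
import java.util.Map;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.GetItemRequest;
import com.amazonaws.services.dynamodbv2.model.GetItemResult;

public class ConsistentReadSketch {
    public static void main(String[] args) {
        AmazonDynamoDB dynamo = AmazonDynamoDBClientBuilder.defaultClient();

        GetItemResult result = dynamo.getItem(new GetItemRequest()
                .withTableName("s3-consistency-index")                   // hypothetical index table
                .withKey(Map.of("path",
                        new AttributeValue("warehouse/play_events/ds=2014-09-15/part-0000")))
                .withConsistentRead(true));                              // strongly consistent read

        System.out.println(result.getItem());
    }
}
```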

You mentioned that you moved over to the Parquet data format from ORC. Is there a specific reason? They both offer very similar functionality and similar performance improvements, so what was the reason to move, and what advantages have you found? Good question. We actually didn't move over from ORC; we moved over from sequence files, which we'd been using forever. For us it was more a question of which one to choose. If we were a pure Hive shop, we would have gone with ORC, but since we use Pig so heavily as well as Hive, and ORC is really optimized for Hive with not a lot of traction in the Pig community, Parquet was an easy way of splitting the difference. So we definitely didn't move from ORC, and when we talk to other companies that say they're a Hive-only shop, we tell them ORC is not a bad choice at all. Thank you.

Thanks for the talk. I had one question about the use of Storm: could you elaborate on cases where you found Storm was difficult to integrate or implement, and that's why you chose to build your own? I think for us the biggest difference, when we evaluated Storm and Samza, came down to two things: one is that they were designed and built initially with the data center in mind, and the other is that they started from a position of strong guarantees and strong consistency. For our use cases, we knew we wanted to be in the cloud. For the first issue, we could take Storm or Samza and bolt on some stuff so it could run in the cloud in our infrastructure, and that's fine. But for the second issue, the stronger guarantees and stronger consistency, we really wanted to make the opposite trade-off: have less consistency and gain more throughput and lower latency. That's really the story of us trying to solve the insights problems: we don't have to have strong guarantees about whether a given piece of data was delivered; all we need is a good enough sample to create a signal that tells us whether something good or bad has happened. We would actually prefer huge amounts of throughput, low latency, and losing a portion of the data; from our perspective that's an okay proposition. We looked at Storm and Samza, and we actually met with the creators and the teams working on them, talked about what we were doing, and they generally agreed that, from a starting position, what we were trying to solve felt fundamentally a bit different from what those frameworks are trying to solve. Thank you.

Cool, thank you. Thank you, Kurt and Justin. We're done with the morning session; we'll break for lunch now. Lunch is on the fourth floor, so please walk up a floor to get lunch, and we have t-shirts right outside. We'll meet again at 2:00. Thanks.