Data Science Happy Hour 14, 18DEC2020

Hey, what's up, everybody, welcome to the @TheArtistsOfDataScience Happy Hour. It is the holiday edition of the Happy Hour. Welcome, everybody. Hope you guys got an opportunity to check out the very, very special episode that I released on Monday. It was just me talking to you guys, and I made you a mixtape, so definitely check it out. People are funneling into the waiting room like crazy. It's going on and it's happening. Welcome, everybody, to The Artists of Data Science Happy Hour. So happy to have you guys here.

So I want to kick this session off with a question that was sent to me by email by one of our listeners, named Jay. Jay wants to know: what are the day-to-day activities of a data scientist in a non-research organization? Do they work from home, or have a similar lifestyle to software engineering or other types of jobs? What's the work-life balance for a data scientist? I mean, I'd say the work-life balance is pretty good, considering that all of us are here in various time zones hanging out in a massive Zoom chat. So I think work-life balance is quite nice for data scientists. I'd love to hear from Tom on this. And also, I can't believe there are so many freaking people in here. This is awesome.

Hey, everybody. I have to personally rejoice, and I almost want to do a jig, because Susan Walsh is in the house. Where's Susan? And guys, I know I'm not putting her on the spot. She's planned a special lip-sync for us, right?

Oh, really?

Yes.

I'll just go watch one of her many lip-sync posts.

Yes, I will. We need to have a proper singing session, Susan. We need to do it. That's what we need. Yes, that is.

Well, so, Harp, your question, and I don't say this completely jokingly: I don't know that balance and data scientists go together in any sentence or even any paragraph.
I'd like to borrow from the extreme wisdom of Gary Keller, who was the founder of Keller Williams real estate. He wrote a book called The One Thing that I had to read because I'm a try-aholic. That means I still have to fight to keep from doing too many things. And he made a point: there's no real such thing as balance when you're trying to excel and do good things. It's more that every once in a while you've got to stop and try to rebalance. And I've found over my many years around the sun that that's pretty accurate, and I'll shut up.

And there's that cliche thing, that it's not work-life balance, it's work-life integration. I don't know if I believe that. I don't really know what that means, but it's the whole thing that's trending right now. Maybe it's crap like that, work-life integration. So let's hear from my good friend John Sebastian. John Sebastian is one of the mentors at Data Science Dream Job, one of my good friends. Super excited you could make it, man.

Yeah, well, I'm super happy to be here, of course. It is my first time, which is kind of strange considering that we work together and I still haven't had the time to stop by and say hi to everyone. But hey, here I am. So, yeah, what was the question again? How is it to be a data scientist outside of academia?

Yeah. What are the day-to-day activities in a non-research organization? And you come from a research background, right?

Yes. I spent almost ten years, if not a little bit more, in research, and then I switched to industry maybe a year and a half ago. So, yes, I would say when it comes down to my day to day, of course, you know, every day is different.
But generally speaking, I touch base with my colleagues. Well, we don't do this every day, but still, you know, just to make us look very, very good: basically, at the beginning of each day, we talk for maybe fifteen to thirty minutes, just to have an idea of what we did the day before and what we're planning to do today. And we exchange some ideas about how we could tackle some problem or issue that we'll probably be having during the day. It's a good time for us to explain our problems, which is always the first step to everything, and also to get feedback from colleagues. And then, you know, basically it starts: meetings after meetings after meetings, and then you realize that, hey, I kind of have to do some work. But yeah, so that would probably be how it starts.

And then, of course, we all have our individual projects and we try to do as much work as we can. Being home is awesome, but at the same time it's awful, because you really need to get a specific room for you to work. Otherwise your work-life balance is just out of the window. So, yes, I have my office here, so it's OK. And I think the most important thing is to make sure to take a break every hour, because it's easy to just sit here and work. I think there's something that is missing, not being at work, and we don't realize that. At work, every once in a while someone will come up to you and say, hey, let's grab a coffee. And without thinking, you just go. And 15 minutes or half an hour later, you come back and you start working again. Unfortunately, we don't do this in this situation, so we don't realize we work a little bit too much. After a while, we just get tired and we don't know why. And now it's Friday. It's five thirty here on the East Coast, and it feels good to be here.

So to answer your question, Jay: data scientist is just a regular-ass job, just like any other job. It's not that special, except we get to use data and science. Anybody else want to talk about what their day to day is like? This was a question that came in from one of our listeners, Jay. He just wants to know what the day-to-day activities of a data scientist in a non-research organization are. We'll get one more response, then we'll open it up for other questions. Then what's your day like?
It needs to be better. The turning on and off, for me, is kind of crazy. I think I'm a little bit unique because I do the strategy side of the house, so I take a lot of meetings. I spend a whole lot of time not talking and then maybe say two or three sentences, and that's an hour meeting. Sometimes we also do the hands-on building, developing some ideas, chaos. And I'll take breaks for like an hour to actually sit down and think. I've got that huge whiteboard behind me that sometimes helps. Sometimes it's just where ideas sit and never go anywhere.

And I think creativity is what I struggle the most to keep in my day-to-day work-life balance, because a lot of times I'm building something or trying to figure out a problem and that just bleeds into my life. My brain doesn't stop working on that problem. The fact that I'm working at home, and I have been forever, you know, bleeds it all together. I start to take the calls downstairs, and I'm at the kitchen table having lunch. And again, it's just kind of bleeding in: coding late at night or early in the morning. It's so easy because it's right here. I can do work any time I want to. And that's supposed to be the excuse for me to be able to do only eight hours a day or less than that. And it ends up that recently I've worked 10- or 12-hour days, and sometimes I end up working on the weekends. I've got a talk that I'm going to be working through the weekend to prep for.

Work-life balance: I think if I was just a pure data scientist, it'd be a little easier. And if I was in an office, I think it'd be a little easier. But it's a combination of COVID and what I do. It's integration. I think you're right. Maybe invasion's the right word. Workplace invasion. It's interesting.
Thanks, man. So, Jay, if you're listening or tuning in, that's a day in the life of a data scientist. All right, guys. Well, I just want to take a minute here to recognize everybody that showed up. A lot of friends of the podcast. We've got Vin Vashishta, you just heard him. He was on an episode with me once. Srivatsan Srinivasan is here, doing two live recordings in one day. Man, he has some stamina. And who else have we got? We've got Tom Ives. We've got Dave Langer, Joe Reese. We've got Giovana. We've got Greg. Susan Walsh. Oh, my God. George, Leona, Jennifer, Mark, Rey, Monica. We've got so many awesome people. You guys are absolutely amazing.

Thank you so much for saying that. Yeah, it's pretty much all of the LinkedIn data community in one spot.

This is awesome to see you guys. Thank you so much for taking time out of your schedule to come hang out with me on this Friday right before the holidays. I couldn't pick any better place to be than right here, right now, with all of you guys. So thank you so much for being here. So we've got a question in the chat from Saurabh. Saurabh, do you want to go ahead and unmute yourself, and we can help you out here?

Hi, everyone. This is my first meeting
at these open office hours. Harp, I'm a big fan of your podcast. There are a lot of big names here, so excuse me if I'm asking some stupid questions. So I am a project manager, basically, with 15 years in the industry. I'm really curious and keen to switch to data science. I've been reading about it and have been connected to most of the big names here, Dave Langer and mostly everyone. My question is as someone starting new. There are so many avenues. For example, we have the SAS data scientist program. Then we have the AWS data scientist. Then we have the Azure data scientist. How relevant or how useful are those in the industry? I myself am in the industry, but not in data science. If someone has to pursue any of these programs: when you are interviewing people, when you are recruiting data scientists, how relevant are all these branded certifications out there?

All right. So that's a great question, actually. If I can distill that down: there are so many different tools out there to pick up, and some of these tools end up offering some type of certification at the end. How do I decide which one I want to take, and what's the benefit of picking one over the other? Did I get that right? So I can say that when it comes to SAS, SAS is really used in very highly regulated industries. When I was a biostatistician working in a pharmaceutical company, SAS was huge; we had to do everything in SAS. And then when I was in the insurance industry, again, everything had to be done in SAS because it was, quote unquote, I guess, validated software, as they call it. So for certain industries, you can't really use open source. Those are two examples right there. That being said, I'd like to flip it. Let's see if Srivatsan wants to help tackle this one.
Yeah. So I think you rightly pointed out, Harp, that SAS is still used if you take banking or many regulated industries, for some of their traditional risk models or even forecasting models. I would just come back to the question: what do you want to do? Because SAS and Tableau or other tools sit at different parts of the lifecycle, and you're coming from project management. Now, you have other avenues as well to enter into data science. From a project management perspective, you can move yourself into more of a project management role on data projects. I know it's not an easy transition. I don't want to say it's going to be easy. But at the same time, take the domain knowledge from what you're working on in project management. You'll have been working in a particular domain. Can you take that domain knowledge and get into project management where you work with multiple stakeholders, set up the background, and also create a road map for the data project? That is one part.

And then slowly see what tools to use, because you have 15-plus years of experience. If you are good at the domain to start with, then you can decide whether you want to be on the visualization side or on the modeling side. That's where SAS versus Tableau comes into play, because Tableau is one of the leading tools in the visualization world, followed by other tools. So definitely these tools are useful, but it all depends on how you want to transition your career into data science.

Excellent point. So that's a great point about which tools to pick. And when it comes to picking which resources to use, I don't think there's any one magic course that's going to teach you everything or that stands apart from everything else in terms of content. But there are courses that are done by trusted, reliable, long-term-thinking individuals who really put work and effort into how they lay out the content.
For example, Dave Langer, he's got an excellent course. So, Dave, I'd love to hear what you think about his question: which tech stack to pick up, and what's the benefit of one over the other?

Yeah. So my perspective is based on being in the technology industry for more than 20 years. Technology comes and goes. R was around for a long time before it became hot. Python was around for a long time before it became hot. Python and R will eventually be cold technologies over a long enough timeline. Trust me on this, it's going to happen. So picking a particular technology stack, I would say, is going to be a tactical decision based on researching the kinds of jobs and companies you want to work for and the kinds of things you want to do for those companies. First and foremost, because a lot of hiring managers are just going to check the boxes: do you know this? Do you know that? Great. Understanding the base concepts is what's really
super important. So I tend to focus my content on the kinds of things that will stand the test of time, like SQL, for example. You cannot go wrong with learning SQL if you don't know it, and that's independent of a technology stack. However, if Snowflake's really super popular right now, sure, learning SQL with Snowflake makes sense, but I would keep that in mind. Tech stacks are a tactical decision at a point in time. What you really want to focus on are the core concepts that are going to be reused and useful for you long term. So that would be fundamentals of machine learning, statistics, data access and data management like SQL, things like that, rather than worrying about R versus Python or AWS versus Azure or whatever.

So it sounds like it's principles over tools, since tools will change over time. So the focus should be on picking up the principles that will lay the foundation for you to build on later in your career as the tides change, and on having the mindset of adaptability to be able to change whatever tool you're using and be open to experimenting with it. So thank you, Dave. Awesome. So I've gone through the chat here and I've organized some questions as best I could. Next up, we've got Ashan with this question. So, Ashan, go for it.

Hey, everyone. Sorry, let me scroll to my question because I forgot what to ask. All right. So, Harp, yeah. What are some trends you've noticed on LinkedIn recently? Did you take a course on how to LinkedIn? I've noticed that people like their own posts to make them more visible to their connections, and certain tips like that I've picked up. But I don't know if there's a course on how to LinkedIn that people are taking, or what's going on. Is there something I'm missing out on?

That's a great question.
So in terms of courses on LinkedIn, I've signed up for a few, and maybe I've watched a few of the videos and just didn't finish them. So I'm notorious for buying courses on how to LinkedIn and then just not figuring out how to LinkedIn. So I guess I'm kind of not the best person to ask for that. How about Susan? What do we need to do to LinkedIn on LinkedIn?

So I have not done any courses, and I've been pretty successful without ever needing to do a course. I find that courses get you to not be yourself. So the most important thing is to talk in your own tone of voice and present in your own style, how you would talk to your friends and your family. Do those kinds of things, and don't be afraid to test things and fail and learn. It doesn't matter if a post bombs; nobody will remember it tomorrow. Try loads of things, and be patient. It's taken me a good year and a half to build up my network and my brand, and to figure out who I am and who I want to be. I like to comment on lots of posts. I do like my own posts, which is just habit now. I don't know whether it makes a difference or not, but I do it. I don't use any automation. I do everything myself. I think, especially the more followers you get, you're putting yourself at more risk if you have any kind of automation at all. What else? Have a mix of content as well. It's really important to show your skills and show your knowledge, but show yourself as well. So show some of your personal side, maybe some fun bits. You don't need to do a lip-sync, but that's working for me.

Somebody who I see being super positive on LinkedIn, just everywhere, is Giovana. So, Giovana, what are your tips for LinkedIn? I can't find you on my screen. I hope you're still here.
Hi, thank you. I agree with everything that Susan has said. Susan, I love your style and your tone of voice. I'm your fan. And I think the most important thing is to be authentic, because you have to show what you are. Don't pretend to be another person. Just be yourself. The most important thing in our community is sharing and caring. I think sharing knowledge can help our community to grow; it's the most important thing that we can give to everyone. And all the people who are here today, they know how to do that. I love it when I read every post from everyone that is here today. It's amazing to see that everyone here has their own style. And I think this is the human touch that we have
in every post that we publish, and we need to maintain that. I think that's why we have almost the same followers, but they love to go from one place to another: maybe the same idea, but with a different perspective. And I think this adds value to the information that we share.

Thank you so much. We actually have a couple of LinkedIn Top Voices and Top Voices alumni here. So, Greg, let's start with you. Greg is a newly minted LinkedIn Top Voice in data science. We talked about this. By the way, guys, I have an interview releasing with Greg early next year, recorded maybe a couple of months ago. It's been a while, but I remember during our conversation we had this talk about how to maximize LinkedIn. So share some insights with us.

Yeah. So for me, it's kind of like different steps, right? You pass the threshold of caring what people think, and you don't take it personally. And then you get into a mindset of sharing, just giving out new information that you learned out there. I see a lot of content that is super tailored to a crowd with knowledge about a subject, like SQL or data science. For me, I get lost. I can understand it, but from a content creation standpoint, I scope out, look at it from a distance, and try to understand why. So why do we have, for example, computer vision? What is it good for? What are the use cases? And then I pull stories that are related to new discoveries. I cannot go into the specifics of how an algorithm works, so I take a different spin on it: I share where I discover something new that can connect not only with the specialists in data science but also with a businessperson, who can also connect with my content.
So that's how, to me, I grew my base. Another thing that's good is to create some sort of cadence. Do you want to do it on a daily basis? It does work, in that people subconsciously expect something new from you from time to time. So it could be within a time range that you choose, whether it's between nine and 12 p.m., that you want to put something out. And then the other one is: always be open to somebody else slapping that post right back in your face with something that you didn't really know or expect. So create that way of communicating and getting the conversation going. One of the things that they told me when they nominated me was that the conversation was good whenever I posted something. And to favor that, I make sure to respond to everybody's answer. If you comment on my post, I'll make sure to respond to it. And most of the time, my learning growth happens when somebody shares something new with me in the comments. So that's what I did.

Thank you very much, guys. For everybody just chilling, a reminder that the chat is popping off. So definitely be active in the chat. It is going to be saved and published when this episode releases as a podcast episode, so keep an eye out for that link there. I can see you've got your hand up. I added you to the queue, so we'll get to your question. I'll get to your second question later on in the program here. Next up, though, I've got Mark. Mark, you've got a question. Go for it.

I was just curious about people's thought process for how they get through debugging. For my current project, I'm basically building out a whole NLP pipeline and coming across a lot of different bugs that I create. I've been able to slog through them and get them done, but I definitely want to get more efficient at it. So I'm just curious about other people's steps: what's their critical thinking process? Like, what's the first thing I do, and then what?
I check this. I know it's not the same every single time, but I feel like it's almost an art. And so I'm curious how other people approach it.

I'd actually love to hear from another fellow student. I'm looking at you, Leona. Leona actually works at IBM, if I'm correct, and she does big things over there. So I'd love to hear how you go through debugging code.

Hi, everyone. It's my first time here, and that's a really good question. I guess
debugging is really important and it can be challenging, but as you do your coding, you're kind of learning where to look and how to deal with it. When I started my career as a data scientist, I was panicking when I was seeing errors or bugs. But as time passes, you kind of know where to look, like the last line, or which specific lines you should pay attention to if you're coding in Python. And just like anyone else, I keep Googling what a certain error may mean or how other people, specifically on Stack Overflow, deal with it. So it's just trial and error sometimes.

Srivatsan, how do you suggest doing some debugging?

OK, so you mean for the whole application? Traditionally, debugging was pretty easy because the tools and everything supported it. Let me maybe go one level into detail on the main thing I've noticed. For a data scientist, debugging is the toughest part, because you have an entire pipeline that you built, from data sourcing to your final deployment or insight generation, and even if it is failing at only one place, it's very difficult to know where it is failing. That's because we as data scientists sometimes complicate things by not writing structured code. We just take a single notebook, push everything inside, and think, OK, we are done.

The very first thing is to modularize the code. Have a clear directory structure: for your preprocessing steps, for your visualization outputs, for the models, and also for the model in production. It cannot be one long file. So structure what goes into production and use the right libraries, such as testing libraries or other things, before going live. The second thing is logging. Logging is very important; it's critical. You need to make sure you log everything to a file. Also, in the cloud, you can use cloud services where all the logs are centralized. Now, writing 10 different logs for each and every process is going to be even more complicated.
So your logging has to be thought of as a horizontal capability in that process. Think about your logging framework up front, try to integrate the logs, and also try to create links to the model metadata and the model. That's where the analysis comes into play, if you're talking about machine learning in general. So try to create more of these kinds of outputs, because your insight cycle is going to be multiple experiments. Each experiment will give you separate results, right? Or separate metrics. Now, if you keep on changing your pipeline and generating results, you don't know which metrics you got for which model code. So start logging each and every model's metrics in this logging framework. Any company has to invest up front, or we have to invest in creating that, so that tomorrow, if you want to go back to what you did and you feel that was the best insight that you got, you can always go back and revisit, and the logging will allow you to go exactly there. And then maybe in the code itself there are ways to automate a Stack Overflow search with the results; you can do that, or you can do a manual search. So that's what we typically do for logging.

Carlos, is it different with R when it comes to debugging? Do the principles change depending on the programming language? So let's hear from Carlos about debugging.

Yeah, I think everything said so far is exactly right. I would have thrown in that R has the pins package, which lets you save the state of various things.
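The advice above about logging every experiment's metrics in one place can be sketched in Python with the standard library alone. This is only an illustration under stated assumptions: the logger name, the `log_experiment`, `read_experiments`, and `best_run` helpers, the JSON-per-line layout, and the parameter and metric values are all hypothetical and were not described by the speakers.

```python
import json
import logging
import tempfile
from pathlib import Path

MARKER = "experiment "  # hypothetical tag that marks experiment records in the log


def build_logger(log_path):
    """Create one shared file logger that every pipeline step can write to."""
    logger = logging.getLogger("pipeline_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    for old in list(logger.handlers):  # avoid duplicate handlers on notebook re-runs
        old.close()
        logger.removeHandler(old)
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def log_experiment(logger, run_id, params, metrics):
    """Write one JSON record tying a run's settings to the metrics it produced."""
    record = {"run_id": run_id, "params": params, "metrics": metrics}
    logger.info(MARKER + json.dumps(record, sort_keys=True))


def read_experiments(log_path):
    """Parse the experiment records back out of the log file, oldest first."""
    runs = []
    for line in Path(log_path).read_text(encoding="utf-8").splitlines():
        if MARKER in line:
            runs.append(json.loads(line.split(MARKER, 1)[1]))
    return runs


def best_run(runs, metric):
    """Return the logged run with the highest value for the given metric."""
    return max(runs, key=lambda run: run["metrics"][metric])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        log_file = Path(folder) / "pipeline.log"
        logger = build_logger(log_file)
        logger.info("preprocessing finished")  # ordinary step logs share the same file
        # Made-up settings and scores, purely for illustration.
        log_experiment(logger, "run-1", {"max_depth": 3}, {"accuracy": 0.81})
        log_experiment(logger, "run-2", {"max_depth": 5}, {"accuracy": 0.86})
        print(best_run(read_experiments(log_file), "accuracy"))
        logging.shutdown()  # release the file before the temp folder is removed
```

Keeping the settings and the metrics in the same record is the design choice that matters here: when the pipeline keeps changing, each score can still be traced back to the configuration that produced it.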
I think in Python, Metaflow does a bunch of stuff for that too. But I was just putting in the chat: at RStudio Conf 2020, Jenny Bryan had a whole workshop on debugging. I went to all three days. There is also a keynote speech about debugging. I highly recommend that. I'll link it in the chat. But also, the number one way to debug is to not make bugs. And the way to do that is to really have in your head what the desired output is before you throw your input into a function. I see a lot of programmers who are like, yeah, I took this code off the Internet and it worked for that guy and his data frames. It didn't work for me. Why is it broken? And of course, you know, the proper thing to do, to be really practical, is to use traceback, use browser, stuff like that. But I really like to have your input and output in your head as you code, so you debug less. Like I said earlier today, try to get out of the situation where you're accidentally causing bugs, and you do that if you think about your inputs and outputs.

Excellent points. I think that Metaflow is an awesome package, actually. I just recently found out about it. It's developed by Netflix. It's open source, completely open source.
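The point above about knowing the desired output before throwing input into a function can be sketched as a handful of assertions written before the function touches real data. The `clean_tokens` step and its test cases are hypothetical, chosen only because the question was about an NLP pipeline; this is a minimal sketch, not anyone's actual code.

```python
import string


def clean_tokens(text):
    """Lowercase text, strip punctuation around each word, drop empty tokens."""
    tokens = []
    for raw in text.lower().split():
        token = raw.strip(string.punctuation)
        if token:
            tokens.append(token)
    return tokens


def check_contract():
    """Compare the function against outputs decided on before running it."""
    # Each pair is (input, the output we expect to get back).
    cases = [
        ("Hello, world!", ["hello", "world"]),
        ("  spaced   out  ", ["spaced", "out"]),
        ("--- ...", []),  # punctuation only: nothing should survive
        ("", []),         # empty input: empty output, not an error
    ]
    for given, expected in cases:
        actual = clean_tokens(given)
        assert actual == expected, f"{given!r}: expected {expected}, got {actual}"


if __name__ == "__main__":
    check_contract()
    print("all expected outputs matched")
```

Writing the edge cases down first (empty input, punctuation only) is where borrowed code that "worked for that guy" most often turns out to behave differently on your own data.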

I plan on experimenting with it in the coming weeks. That's actually a really great topic that I'd love to hear from more people on. I think we can even get into some of the philosophy of debugging. So let's ask a couple more people what they think. I'd love to hear from Monica about this, and after Monica, let's go to George.

Hi, everyone. I just have two general things that I wanted to share about handling errors. What I like to do when I'm learning something new is kind of fail on purpose. So I just break it and see what error pops out. If you do that enough times, then when something comes around again, you'll notice: oh, this has happened to me before, I know how to fix this. And another one is being very active on sites such as Stack Overflow. You probably use that to research your own errors, but really get into answering other people's questions. I think that really helps you solidify what's going on in the back end as well.

There's one thing I do want to mention. Great answers, by the way. I started my data career as a software developer, and I think there's one commonality between software developers and data scientists, and it's the fact that we tend to go at tackling the problem solo. And that's not a bad thing necessarily. But it's sometimes valuable to try and ask for help: ask your colleague or somebody else, who might not know the issue at hand as well as you do. Just getting that fresh pair of eyes definitely helps. And I know if I'm asked to do debugging for somebody else, I'm not always happy. But sometimes it's nice to tackle that challenge as well and see what you can find. And, you know, it can save you a few hours instead of digging into it yourself and finding out where the bug is.

Thank you very much for those awesome tips and advice on debugging.

I just have one more thing on debugging.

Oh, definitely. Go for it.
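The "fail on purpose" habit described above, together with the earlier tip to read the last line of the error, can be tried out in a few lines of Python. The `provoke` helper and the four breakages are hypothetical examples picked for illustration; the sketch only shows how to trigger an error deliberately and look at what comes back.

```python
import traceback


def provoke(action):
    """Run code that is expected to fail; return the error type and last traceback line."""
    try:
        action()
    except Exception as exc:  # broad on purpose: the goal is to see what gets raised
        last_line = traceback.format_exception_only(type(exc), exc)[-1].strip()
        return type(exc).__name__, last_line
    return None, "no error raised"


# Hypothetical ways to break things on purpose.
BREAKAGES = {
    "missing dictionary key": lambda: {"a": 1}["b"],
    "index past the end": lambda: [1, 2, 3][10],
    "adding text to a number": lambda: "1" + 1,
    "parsing a word as an integer": lambda: int("twelve"),
}

if __name__ == "__main__":
    for label, action in BREAKAGES.items():
        error_name, last_line = provoke(action)
        print(f"{label}: {error_name} -> {last_line}")
```

Running this once and reading each message is the whole exercise: the next time one of these error names shows up in a real pipeline, it is already familiar.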
Yeah. One of the things I've noticed when I debug my code is that 90 percent of the time it's something really stupid, like a misplaced comma or a misspelled word or something like that. So I would say if you find yourself spending more than 30 minutes trying to solve a problem, there's a very good chance it's something that you missed, and it's usually something really stupid and simple. So maybe check the simple stuff first, and that might save you some time.

Awesome. So I guess to also address that issue, have a good IDE. I use VS Code. Whatever you guys use, type it out in the chat. I'd love to hear and see what it is that you guys use, so definitely type that out. So next I have Floran in the queue. After Floran, I've got Jennifer, who is about to spark off a Python versus R debate, which should be a good one. OK, so, Floran, are you still here?

Yes.

Awesome. Go for it.

So my question is regarding scraping and building a data set. I'm planning to kind of scrape the whole web, in a sense, from different websites. On Goodreads you can find what people read; on other websites, you can find where they travel and other things like that. And then do, like, hobby discovery, database discovery. I'm curious if people have worked on this. Also, I know legally it's somehow complicated, but I don't know what the challenges are. So how do people see it?

So you're asking us to give you advice on how to scrape personally identifiable information on the web? Did I hear that right?

So I know how to do it technologically, having the database and putting the data in. Also, maybe there are projects that are doing this in different universities. So I don't know.
Yeah, you're saying that it's a bad thing, but there's an entire industry called open source, and that doesn't mean open source like R, that's focused on scraping everything on Reddit, everything on Twitter, everything on Instagram. If it's public, a government that you don't like is recording it.

So that's just one of the things that I'm doing. I'm mostly scraping the users. Not exactly the posts, not that much, though, because that is too much capacity for me to handle. So I prefer to have metadata about the user. I don't get that much of the posts because I don't have that much memory.

Anybody want to take a stab at that?

I guess I'll go ahead.

Go for it.

I guess one question I would ask is: is this for research purposes or is this for commercial use?

I like data, so for me it's about fiddling with the data and seeing. I don't know how I could monetize this. Probably it could be done, but it's mostly for me, because I like to see what the results would be.

Got it. So, in general, I don't want
to make this a blanket rule. Everyone, please feel free to argue with me on this, because I'm sure this is wrong. But if you're an individual and you're doing it for research or toy data set purposes, typically they won't go after you for that. Some websites do have anti-scraper sorts of initiatives going, so it will be a little bit harder to do. If you eventually want to monetize or commercialize it, that's where you could potentially get into trouble. Well, it depends, but in general, just don't do it. I just think it's safe to say don't do it. Secondly, a lot of places offer APIs. That's honestly the better way to go about it: if you can get access to a legitimate API, I would just go do that. And then the other thing I would consider is, if you really want to, you can look at scraping services. It's one of these things where it's about figuring out: are you doing it because you're just interested in learning how to scrape? That can be very, very useful. Or are you trying to do a project of your own? Are you trying to spin up a product or service?
If you’re trying to do a product or service, just don’t do it. If you’re doing it for an individual project or research purpose, you probably don’t need a whole lot of data just to go, oh, I can write a Beautiful Soup parser, or I can do a little crawling spider. So I would just say, consider that, because even with the LinkedIn case, where they had some rulings about how you could approach their scraping, it’s still tricky. If, for example, you take a project that you eventually want to monetize, and it’s someone else’s intellectual property, then that will get you into trouble long term. So I would just say, consider some of those questions. Are you doing it for an individual purpose? If you do eventually want to monetize it, just don’t do it. What do you really need it for? I think there are a lot of people here who are experts who can speak to that a little bit more.

Yeah, this is a topic I’d just move right past. I’ll give you a white hat hacker answer.

Yeah, you do that in the chat. So there’s a resources post in the chat on web scraping; we’ll just move past that. Next up is Jennifer Nurdin. You have a question.

So I don’t intend to start a war here, but I like learning about data. I’m very much on the business end of data pipelines, but I’ve got a lot of databases to get to, and I need to start merging some code. Over vacation I want to deep dive into either Python or R. Which one should I deep dive into? You can give me one reason why, and one reason why I should not do the other one. Go.

I was trying to do one or the other of them, so at the time I said I would recommend one of those.

Yeah. Oh, good. Excellent. You’re on the right track. One of them is a good place to start. Anyone with a contrarian view? Let’s hear it.
R and Python are the actual correct answer. I have been there and done that. Now I want something new.

Oh yeah. I always forget the answer to this, but the way I remember it is that Python’s for products and R’s for research. That helps me remember, because I forget when it comes to picking up TensorFlow over Christmas break.

So, I’ll repeat this: I would just flip a coin. Right? Heads or tails.

I mean, the holy wars are kind of silly, in my opinion. I’m more of a Python guy, so that’s the interest in Python, but R is great too. I think the language wars are just utterly stupid. Flip the coin and go along either way.

Yes. Real quick, an important one: for those of you Python people that are privately jealous of R Shiny, you can pull it into Python. We also have Dash and Streamlit, which are really nice. I will admit R has made me jealous for many years.

So Jennifer, I’d say this, and I’ll give an unbiased answer. I don’t know if it is for your own personal development that you’re trying to code, but take a look at the people in your organization that you’re close with and see what they know. Because if you’re in an organization that highly favors use of R, that means you’ll have a lot of people that you can tap into and be like, hey, I’m stuck on this thing, rather than having to post it on Stack Overflow and search for hours and hours. You just go to somebody: hey, need some help, help me out. So I’d say just take a look at the people around you, people you can reach out to who know some programming language, and see what the consensus is among them.

It is pretty balanced. And that’s why I’m like, well, I’ve got to pick one. I only have a couple of weeks.

Yeah. And I think that kind of points to R versus Python being one of those academic debates. At the company I was at, which wasn’t big, around a couple of thousand people, with our data science team of 30 or 40 individuals serving different teams, it really was split 50/50. And all the managers had to know both R and Python, because sometimes we would get code from a different part of the business and then we would need to adapt it. And if I was like, oh, I can’t do this, the manager or the team leader was still usually on point to help mentor and guide that sort of translation. For me, I started learning R in academia, and then went out and learned Python. But at some companies I still had to do both. I had to read R code and then understand, if issues came up, what was going on. So if you just pick one and rely on the people around you, you can’t really go wrong, because honestly there are so many tools and libraries in both languages that you’ll be sitting pretty on either one.

So we’re going to do this: I’m going to fire up a poll and everybody can vote, and you can pick whoever the winner is.
And we’ll just go on with the next question that we have in line. So, everybody, I’m going to set up a poll real quick and we’re going to decide Jennifer’s fate. In the meantime, is the next person in line still here? OK, you might have left. I think they said they had to leave. OK, I’m going to take a minute to recognize some of the folks who have jumped in, because I finally got a chance to look. I just realized Ben Taylor’s here. Ben, what’s up? We got Sarah. Sarah, welcome back, glad to have you here. Cameron is also here; keep an eye out for an episode that I have with Cameron. We got Ray in the house, got David Telo. Beautiful, wonderful, amazing people. Thank you so much for being here. So let’s go to Karen’s question next. Are you still here, Karen? Has Karen left? So, Ashley, what is your second question? I really like this question and I’d love to hear from everybody here on this as well.

Yeah. Hi. Second time. So what are your expectations of someone in a junior role? I’m currently a junior data developer, and I feel like I’m kind of all over the place as far as learning goes. I’m handling a couple of different tools at the same time. So are you expecting me to know it all, or just, you know, document stuff? So, yeah, what are your expectations for someone in general?

So, Sarah, I’d love to start off hearing from you. And then after that we’ll go to a Shreveport’s, and then we’ll hear from Monica and Mikiko. We’ll hear from everyone, because this is a really, really important question. Go for it.

So, awesome. Yeah.
I like this question. I guess the way I would guide someone who is junior is to not get super overwhelmed, to test out different things, but find out where your strengths are. When I first started, it was really, really easy to just see the sea of everything you don’t know, feel like the expectations of you are extremely high, and feel kind of down on yourself that you can’t meet everyone’s expectations. And so I think that, coming from above, we should also be a little bit more level-set on what our expectations are of junior folks. The field is so widespread, and you’re not going to find a data unicorn. So I would lean into projects that you find very interesting and that are very much related to the industry that you want to stick to, and focus on developing the skill sets for those. I know we harp on communication skills. I think documentation is extremely important, and then focusing on industry-specific skill sets that can help you along the way. So that would be how I would guide you there.

Butson, what do you look for?

Yeah. So if you’re specifically looking for a skill, I would say the very first and most important skill that is required is SQL, because most of the time as a junior coming into a project, you will be given data to analyze, and 90 percent of the analysis can be done with SQL.
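As a rough illustration of analysis that goes beyond a basic SELECT, here is a minimal sketch that runs a common table expression and a window function against an in-memory SQLite database. Everything in it is an assumption made for the example: the `orders` table, its columns, and the sample rows are hypothetical, and the window function requires a SQLite build of version 3.25 or later, which recent Python distributions normally bundle.

```python
import sqlite3

# In-memory database with a small, made-up orders table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (customer TEXT, order_date TEXT, amount REAL);
    INSERT INTO orders VALUES
      ('ana', '2020-12-01', 50.0),
      ('ana', '2020-12-05', 70.0),
      ('bo',  '2020-12-02', 20.0),
      ('bo',  '2020-12-09', 10.0),
      ('bo',  '2020-12-11', 40.0);
    """
)

query = """
WITH customer_totals AS (          -- CTE: one row per customer
    SELECT customer, SUM(amount) AS total_spent
    FROM orders
    GROUP BY customer
)
SELECT o.customer,
       o.order_date,
       o.amount,
       -- Window function: running total per customer, ordered by date
       SUM(o.amount) OVER (
           PARTITION BY o.customer ORDER BY o.order_date
       ) AS running_total,
       ROUND(o.amount / t.total_spent, 2) AS share_of_total
FROM orders AS o
JOIN customer_totals AS t ON t.customer = o.customer
ORDER BY o.customer, o.order_date;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The running total and the per-customer share are both computed in the query itself, which is the kind of work that can stay in SQL before any scripting language or visualization tool gets involved.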

Right? Down the line, Tableau will visualize and show you the output, but SQL does all the work, so that’s where you’ll be spending most of your time. So that’s a good start. You need to go advanced in SQL, I would say. Everybody can write a SELECT and all. But when you’re doing analysis, you may need some of the analytical functions of SQL and all the key expressions and everything. So go into detail with that. That’s a good start. And as you get into the project, if you know Python you can go into detail on it, and you can learn on the job. That’s what I would say. The industry is completely polluted with jargon and everything that gets put on juniors. That’s kind of unfortunate, but I would say that SQL is a good start.

Monica, what do you look for in a junior data scientist?

Yeah. So aside from any of the technical skills, what they don’t usually put on job descriptions are those soft skills, mainly curiosity. The main point of your job as a data analyst or a data scientist really is to solve problems. So if you’re curious to understand what that problem means and what you need to do to solve it, and you’re continuously learning those types of skills, that’s what I lean towards, because you can always improve your SQL or Python or what have you along your journey.

Ben, what do you look for?

I think everybody’s brought up most of the main points: soft skills, curiosity, and usually one or two technical skills that are really important based on the business; those are obviously huge. But every once in a while you’ll run into a position where you might need something else as well.
What I’m going to do is partner you up with a technical lead. I’m going to partner you up with somebody who can do mentorship from a career perspective. And those are the two things, not so much that I’m looking for, but I want you to be able to develop under mentorship and under technical guidance. I want to see that you’re progressing quickly and learning from someone who’s essentially on the same project, handing out smaller tasks to you and frequently checking in with you. I think there’s got to be some sort of a framework, and especially a safety net. That’s a good way to say it, so that you never get too far out on your own. Like I said, from my perspective, that’s what you can expect from me. But then the expectation of you is rapid learning. I want to know that I’ve hired somebody that learns quickly and who’s willing to try new tasks.

Actually, I’d love to hear from a listener as well. What do you look for in a junior data scientist, maybe specifically at your organization? What kind of candidate do you look for?

So soft skills, as a lot of people mentioned, are really important. We’re looking for people who are curious and ask questions, because a lot of things can be new when you just start your role. So asking a lot of questions and clarifying things is really important. And the other thing is networking with others: talking to other junior data scientists or senior ones is another thing which we really, really value.

Yeah, definitely. So let’s hear from... I don’t know who wants to go, there are so many people I’d love to hear from. So, Amaia, do you have an answer to this question, or do you have a question? Because I have you in line, number four from now. So if you have a question, just sit tight. Mikiko, what do you look for in a data scientist?
Yeah, I will caveat that there are two factors that impact what is expected of a junior data scientist. One is the company size. If you’re in an early-stage startup or a small business, the mindset typically is: we might not be able to pay you a whole lot, but we’ll try to treat you with respect, and also give you a lot of responsibility, and the learning can be really fast. So there, personality, grit, and creativity are super important, a lot more important than having advanced technical skills. There is still the bare minimum of technical skills: SQL and some kind of scripting language, just because, once again, there’s a lot of additional analysis and visualization that you might need to do.

But then there’s the personality, the grit, the creativity: being able to go work on a problem, search Google for issues and bugs as they come up, and then come back to the team lead or the manager if you run into issues. So that’s for an early stage. At a more established company, you might not be wearing multiple hats. You might be wearing just one hat. And so if the central focus of your work is, for example, strategy analytics, the expectation there is not so much technical. First off, you can ask really good questions. Secondly, you can scope and do some kind of time and project management, while being willing to ask for help if you run into scope issues. If, say, the product of your work is more engineering focused, then I think it’s having the technical skills. Yes, maybe you’re not as up to date on best practices, or on Scrum or Agile or anything else, but those are things that we can teach you and show you. And if you’re doing research, once again, it still goes back to this: you don’t need to have advanced knowledge.
You just need to have enough to be able to do the work, but still come into it with that willingness to ask questions, being able to talk to people, not isolating your business partners, all that good stuff. The technical bullet points, honestly, are SQL, some kind of scripting language, and then some way of documenting your work and communicating it. It could be visual, like a BI tool. It could also be some kind of document management tool. That’s pretty much what I look for.

So, yes, I definitely think there’s something to be said about having a collaborative mindset and approach when you’re dealing with stakeholders. That’s definitely something I would look for in a junior data scientist, given that they oftentimes have to translate a lot of the heavy business problems into a technical solution. I would also look for the ability to communicate with data: someone who’s very analytical, who knows how to utilize different types of data visualizations or set up dashboards in such a way that it translates recommendations very soundly, so that your stakeholders are not confused about what to do with the information that’s being shared. I definitely think that’s a great way to help the teams with more advanced people know where to spend their efforts, and a good roadmap to helping drive someone’s career as well. So that’s something I would look for, more on the soft side.

As someone who is interested in being in data science, it’s wonderful to have so much great advice.

Great tips. Ashan, I hope you’re taking notes. Even if you weren’t, I transcribe this, so it’ll all be there for you. First of all, cheers, everybody. I hope everybody has their holiday beverage. I’ve got a cranberry stout, getting really festive with it. So thank you guys so much for hanging out. The next question we’ve got up is from Lolita. Are you still here?
Then after Lolita we’ll do Eric and Greg, and I’ll circle back and see if the others are still here. Lolita, are you still here?

Hi. Yeah, I’m here. So it’s my first meeting, so I will introduce myself. I’m a graduate student at the University of Minnesota and I’m applying for full-time opportunities in data science and machine learning roles. But I have had hard luck so far. I’m not even getting calls from the companies. When I apply on LinkedIn, I go to the job description and I feel like this is the right role for me. But even then, I don’t hear anything from the recruiters or from the company. So what advice would you have for a person who just wants to enter this data science community? I have done academic projects, I have professional experience of more than two years, and I’m also doing an internship with a company over here in the U.S. So I think I have a pretty good background, but still, I’m not getting anything so far in the industry.

Yeah, definitely. That’s a great question. I’ll chime in with a couple of bits of advice, and then I’d love to hear from Sarah, then Mikiko, then Ben. I would say this: first, make sure that when you apply for a job, you don’t just apply and assume that they’re going to call you back. There’s still work to be done after you hit the apply button, and part of that work is to let people in the company know that you applied, whether that’s going to LinkedIn, finding a technical recruiter, and shooting them a message: hey, love what you’re doing with the company, I think it’s amazing, I’d be happy to talk about how I can contribute positively in this role. I’m just making stuff up off the top of my head, but you get the picture. You want to make sure you’re actively letting people know that you’ve applied for the role. Now, if you can’t find a technical recruiter on LinkedIn, then maybe try to message a data scientist and just let them know that you’ve applied for the role, and that’s about it. I wouldn’t ask for a recommendation or anything like that. The next bit of advice, I would say, is just be persistent. It’s a numbers game. For any application that you submit, you need to assign a probability of even getting that job of less than one percent, because you’re one of maybe a hundred thousand applicants. So just keep opportunities in your pipeline by applying, applying, applying, and then following up and following up. So let’s hear from Sarah, Mikiko and then Ben.

OK, I guess the first thing I would say here is that forging relationships in the community is extremely important. The timing, though, is also important when you create those relationships. I did a talk during Keith’s data science conference about the importance of those relationships and when you can actually put them into play. You’re currently in the stage where you’re on the job hunt.
If there are connections that you’ve already forged with people who can help you at this stage, I would leverage those. In the case that you don’t have that, I would say creating, I don’t want to say noise or chatter, but your own personal brand within this community can also be helpful for people to see your profile. Since I started, I think my first job was the only one that I actually applied for. Since then, I’ve been approached by recruiters, which means that I’m now on their radar. That’s true for many of us who are active on LinkedIn: engaging with the community, putting your name out there, having a personal profile where people can look at what you’ve created and generated and be interested in having you on board. To what Harpreet Sahota said, I think it also plays into... what did you mention that I liked? You said something really good. Now I’m forgetting. But yeah, just forging those relationships, talking to people within the company. And shoot, if I remember it, I’ll come back.

Definitely. Let’s hear from Mikiko and then Ben, and then after that, let’s hear from Tom.

Yeah. So, like Carlos and Matt have kind of said in chat, part of it is about your sales pitch. For context, for my particular background, I only have a bachelor’s in anthropology and economics. I did a boot camp at Springboard, but for the most part, I think by most measures I would be considered very uncompetitive or unviable for data science roles. But for me personally, the way I was able to leverage it was, first off, I had domain expertise, specifically in sales, marketing, and revenue operations. And from working with sales teams a lot, I learned how to really develop my pitch. What is your unique value proposition?
And I hate to say it, but it cannot be: you have a PhD in X, you have a master’s in Y, you did your undergrad in Z. It depends on what kind of role you’re going for. If it’s a research role and you have published papers and all that, like for Google Brain or Facebook research, then that would work for them. But if, for example, you’re applying to data scientist roles that are closer to strategy, or senior analyst roles, they actually don’t care as much about your education. They tend to care more about your problem-solving abilities and how you can demonstrate that you moved the needle. They care about whether you’ve worked with product data, whether you’ve worked with sales and marketing. So one thing I would say is, first off, really understand the kind of roles you’re going for. Essentially, there are three personas within the data science world: you’re doing engineering work, either on models or data pipelines; you’re doing some kind of research work; or you’re doing some kind of analytics. They each have the kinds of things that they like to see, so I would say figure out which of those roles you want to go for. So definitely don’t spray and pray. Secondly, develop your pitch such that you are focusing on your unique contribution, what you could bring to that role. Harpreet and a couple of other managers like to talk about your superpower. It doesn’t have to be technical. Sometimes it can be that, for example, you’re a really great facilitator between totally different teams. My pitch was that I understand sales and marketing for companies at early-stage startups, and both engineers and sales teams like to have me in the room, which apparently is rare enough that it perks people’s ears up. So I would say focus on those two things, or three, rather. One is understanding the roles and work that are available. The second is understanding which of those roles and what type of work you would like to be doing. And the third is understanding what your unique value propositions are, so you can bring them out when you are tailoring your resume and your CV. This is where getting other eyes to look at your resume, like the offer Carlos made, can really help with bringing that out.

Ben, I’d love to hear your perspective on the second part of her question, which was essentially that she looks at these job descriptions and she’s just overwhelmed by the requirements. What are your thoughts on that? And also, can we please hear the story behind the hashtag unrecruitable?

Yeah, hopefully this isn’t too unflattering. So, really cool, people: I’m at Brighton, Utah, right now. It’s good. It’s got the powder. It’ll be good. It’s about 20 minutes more. So, people don’t know how to write job descriptions.
And so the joke that people throw out there is: I want 20 years of deep learning experience, or something stupid that just doesn’t exist. Or you’ll actually see requirements where it’s impossible to fill that need. So imagine: I want an R expert and a Python expert. That’s super rare. If I add the big data stack there, that’s even more rare. So I wouldn’t read too much into descriptions. I really like what people have been saying about your brand. Professors really don’t know that much about getting you a job. They know how to teach you, but they don’t know how to get you a job, and they don’t understand branding. So I tell people, go give a presentation. Give a presentation at a meetup once; it’ll help build your brand and get you out of your comfort zone. The other thing I wanted to say is, if you’re normal, I’m so motivated to not hire you, because I don’t want my team to suffer. If you have curiosity, that’s great. But if you have a passion, that’s huge. So get that passion and show people you have it. As for the unrecruitable story: I was just sick of getting hit up with recruiter spam, which everyone on this call has been hit up with, where they want me to be a Java developer or something. I’d rather see what’s next after this life than be a Java developer. So I put that up there, thinking it would help. It didn’t help.

The comments are blowing up. But just for the record, answering that while on a ski slope: awesome tips and an awesome place.

I wanted to be answering this while I’m in the air.

Jennifer, so far, Python is in the lead, thirty-three to six. There is still time to vote if you haven’t already. So who else wants to share some insight on this?
So the issue here is she’s got a couple of issues: she’s applying to companies and not hearing back, and she’s getting kind of intimidated by these job postings. Monica, what do you think?

Sorry, I was typing in the comments. What is the question?

So she’s been applying for roles and just not hearing back, but she feels like she’s got a solid background for the roles; there seems to be a match between the work she’s done in grad school and the jobs that she’s applying for. And also, when she goes to apply for jobs, the description just seems like, what the hell? This is insane.

Mm hmm. Yeah, I don’t have much to add. Everyone’s brought really good things to the table as far as reaching out to potential recruiters on LinkedIn. If you see somebody that can help you out, even if they’re not a recruiter, if they just work in a department that’s related to where you applied, you can ask them if they know a specific recruiter and kind of weave your way through to find the correct contact. It’s just all about networking and reaching out. Don’t be shy about it. The worst thing that they can do is say no. So that’s what I would add.

So I think somebody who has something valuable to say here is John Sebastian, because we answer questions like this multiple times at Data Science Dream Job. So this is something that I know you have an answer for. Go for it, John.

Right. So, I mean, there were a lot of great answers before me. But if I could add just one thing: remember that this is just basic human psychology. You’re trying to develop a relationship with someone and you’re trying to sell yourself so that they will consider you for a position. So if you’re on LinkedIn sending your resume and sending messages to recruiters saying, hey, pick me, well, that might be good, actually, because if you apply for a position and there are a hundred candidates, then you’re no longer just a piece of paper. Suddenly you’re someone, an individual connecting with someone else. And that could help you be picked out of the bunch. You might not get hired, but at least you could be considered, because it’s all about maximizing your chances of being picked. So that’s one thing. Now, when it comes to developing your network, I agree with everything that’s been said so far. The only thing that I would add is, and we hear this all the time, people saying, oh, I sent an email to that person and they didn’t get back to me. Well, OK, but what did you do?
So basically, don’t ask for a favor unless you have something to give them. And if you don’t have anything to give them, which is probably what will happen, and it’s totally fine, start developing a relationship. A way to do that is to bring value to the other person. Let’s just say I want to connect with Harp on LinkedIn. What I would do is go on his profile, see his connections, see the articles he has talked about. I can figure out that he’s actually the host of a podcast. So first things first, just send a message saying, hey, I just saw that you’re hosting a podcast. This is awesome. It looks very interesting. And that’s it. Don’t ask for anything. Just create that relationship. And then maybe a week later, come back, bring something else, and start trying to develop a connection in which you can actually start to exchange with that person.
And then maybe you can bring up some concerns and say, hey, I’ve been looking at this position in your company. Would you be able to help me out, or what can I do to show that I would be a good candidate? So this is probably what I would recommend. Just remember that you are in a relationship with someone, that you’re connecting with another human being.

Can I add something real quick?

Yeah, definitely, go for it.

So if you’re looking for a data science job and you’re not getting anything, maybe you want to change the strategy a little bit and focus on companies where, in their culture, like at Amazon for example, they really encourage people to move to different positions. So maybe what you want to start with is a data analyst position and then move afterwards, because what the hiring manager usually looks at is: how long does it take me to train that person to gain knowledge in the area that I need help with? If they feel like that training time is long, you’re not going to hear anything back. Also, like everybody else said, it’s about selling yourself. What can you claim for yourself in terms of domain knowledge? And how can you back that up with sound data to showcase that you know what you’re talking about, that you’ve solved this issue already, and things like that? And to the young ones: time is your best asset right now. There’s nothing in the book that tells you that you have to start as a data scientist. At a place like Amazon, which wants a data scientist who’s done so many things, maybe it’s good for you to start somewhere else and then work your way into it. Then somebody will look at you as someone who already knows a little bit about the company, who can be trained a little bit faster and get up to speed faster. So look at different strategies. Don’t just look at “I want to be a data scientist” and get stuck on it.

Thank you very much, Greg. So the last person to hear from on this question is Leona, because I know you went through a similar journey to the one Lolita is going through here. What tips can you share with her? And then after this, we’ll get to Eric’s question and Greg’s question, then Amaia’s question.

I love the question. I was in the same position that you are in right now. Almost two years ago, it was really hard for me to get any interviews. I was applying on LinkedIn and trying to talk to people on LinkedIn, but then I found a conference which was related to my field plus data science. My field is economics, and I just put my resume there. There was a job fair at that conference, I got some interviews there, and I got my offer from IBM from there. So the point is, if you can find any conferences, through your advisor or yourself, on Twitter, LinkedIn, anywhere you can find them, just try to attend. I believe it’s a little bit challenging with the pandemic, not talking to people directly versus when you could go to conferences, but still consider it. I got my job that way.

So, Lolita, lots of great advice there. Don’t worry, this session has been recorded. There will be transcripts and you will definitely have access to these answers. Hopefully that helped you out, and good luck in your job search. Next question: let’s go to Eric. And just preemptively, for the answers to Eric’s question, I’ll go to Dave and can be cool.

So I wanted to throw out one quick thing to Lolita, as a fellow job searcher: here’s something that’s worked for me.
And I didn’t make this up. This came from Reno Perry, who’s way smarter than I am. Go to LinkedIn, go to the search bar, and just press Enter. You don’t have to search for a word. Then go to Content, then to posted in the past twenty-four hours or past week, then go to the companies that people work for, type in the name of the company you want to work for, and see what people have posted. I have gotten two interviews from people posting, hey, we’re hiring, DM me. So I send them a message. I’m like, hey, saw this job. And I post on LinkedIn, so if they want to look at my profile, they can see that I’m there, I’m active, I’m a real person. I’ve had two interviews from it, so it works. So try it. And then you’re actually talking to the people that you want to talk to. So there’s that.

So my question is: I had someone reach out to me on LinkedIn to talk about a project. I wrote all this down so I can remember it. So I have an unlabeled data set with thirty-six variables that seem to show correlation into kind of six major categories. I want to use the data to identify similarities and hopefully create clusters or profiles of those groups. I want to label this unlabeled data, basically, right? So I used latent factor analysis with varimax rotation to create six factors for categorizing the data. The variables didn’t come labeled as being six categories, but the rotation works. So, two quick questions. One, does it make sense to cluster on the scores of each observation for those factors? I think it makes sense, but I just want to double-check. And then my second question is: using this model that now exists, can I use it on validation data to get scores and assign a cluster or group to those new observations? Does that make sense?
Like, I’m trying to figure out if I can use that predictively.

So the short answer is yes. The long answer is it depends, because there’s no such thing as a free lunch, right? Sure, it might work, and it might not. What I’m hearing is that you’re interested in using clustering to create labels, and then using the labels with the original data to create a classification model, right? Because you’ve got six distinct labels. You can certainly do that. What I would do is incorporate as many additional features as you can back in when you train the model, and then try to use a more sophisticated algorithm that has nonlinear boundaries, ideally. What I’ve found when I tried to do this in the past is that things like random forests or boosted decision trees tend to work better, because they can form arbitrarily complex decision boundaries based on the nature of the algorithm. So that’s what I would do. And then, of course, the problem is going to be that once you train it, it’s hard to verify what your generalization error is really going to be. So always keep in the back of your mind that, yes, you can do it, and it will work sometimes and sometimes it won’t. Generally speaking, you want to factor in as many inputs to training the classification model as you can. What I’ve done in the past is use multiple clustering algorithms, create a different model for each one of the clustering algorithms, and then see if I could use them together as an ensemble to create the final predictions for the final data set.

Well, that’s super helpful. Thank you.

So would anybody else like to chime in on that one, Cam or Van? Cam, if you want to go, go for it.

Well, just hopping in here, a lot of what I was thinking was actually along the lines of what David was mentioning: just stress testing different types of cluster analysis to see what works. In the past I’ve had to cluster different types of personas, and I’ve tried k-means or k-medoids or k-modes, trying to see which one is the best of the lot, and then really going from there. But I agree completely with what David was just mentioning. That’s a sound approach here.

Van, any tips? I only caught the end of the question. I’m sorry, I didn’t get the whole thing, so just skip me on this one. Sorry. Yeah, no problem.

Eric, do you feel like your question was answered satisfactorily? Yeah, that was definitely helpful. I hadn’t really considered taking the cluster definitions and then backing that out to using all of the variables instead of just the factors. So that’ll definitely be my next step.

So, everybody, thank you so much for hanging out and sticking with me while I get through these questions.
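
As a rough illustration of the workflow discussed in this answer (factor scores, clustering on those scores, scoring new observations with the same fitted objects, then a nonlinear classifier trained on all the original variables), here is a minimal Python sketch. It is not the code anyone on the call used. It assumes scikit-learn 0.24 or later, synthetic data standing in for the real thirty-six-variable data set, and k-means as the clustering method; every variable name is made up for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import FactorAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the unlabeled data: 36 variables driven by 6 latent factors.
n_samples, n_vars, n_factors = 600, 36, 6
latent = rng.normal(size=(n_samples, n_factors))
loadings = rng.normal(size=(n_factors, n_vars))
X = latent @ loadings + 0.5 * rng.normal(size=(n_samples, n_vars))

# Hold out some rows to play the role of "new" or validation observations.
X_train, X_new = train_test_split(X, test_size=0.25, random_state=0)

# 1. Fit the scaler and the factor model (varimax rotation) on training data only.
scaler = StandardScaler().fit(X_train)
fa = FactorAnalysis(n_components=n_factors, rotation="varimax", random_state=0)
train_scores = fa.fit_transform(scaler.transform(X_train))

# 2. Cluster the observations on their factor scores.
km = KMeans(n_clusters=6, n_init=10, random_state=0)
train_labels = km.fit_predict(train_scores)

# 3. Score new observations with the SAME fitted scaler, factor model, and clusters.
new_scores = fa.transform(scaler.transform(X_new))
new_labels = km.predict(new_scores)

# 4. Use the cluster assignments as pseudo-labels and train a nonlinear
#    classifier on all of the original variables, not just the factors.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, train_labels)
agreement = (clf.predict(X_new) == new_labels).mean()
print(f"classifier vs. cluster assignment agreement on new rows: {agreement:.2f}")
```

The agreement figure only says how well the classifier reproduces the cluster assignments. It does not say whether the clusters themselves are meaningful, which is the generalization caveat raised in the answer.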
Sorry if I have not called on anybody in a while. Peer input is always welcome. Feel free to just unmute yourself and jump in on any answer at any point. Next up is Greg’s question. Greg, eager to step up again?

So, Harp... no, no, I tried to stop. So at some point in my career, I would like to run a company, create a startup. And I have a short-term plan and a long-term one: short term is within the next five years, long term is ten plus. And I’ve been reading about federated learning. One of the things that I want to attack is distribution. We don’t have a shortage of resources like food. We have a problem with distribution. Does anybody know a little bit about federated learning, and whether it’s one of those things where the big data gobblers are the only winners, or whether a startup can bank on this? I know Google started this idea of having a model train on your data on your device without taking that data itself to centralized data storage to do things with it, right? So in this case it’s partially GDPR-sound, but there are still some concerns that the model could memorize that edge device’s data. So I’d like to know how federated learning is moving now, with some use cases that you know of, so that I can have a better understanding of whether it’s the solution, or part of the solution, for fixing some of those distribution issues that we have.

Can I take a stab, Harp? Yeah, go for it.

I love this question. I’m not shocked that Greg would ask it. So this is actually brilliant. It’s actually a form of integrating brilliance, because you get to keep people’s data private. You get to train on that data, but it’s a hive mentality, too: if the parameters can be shared, that’s still keeping your data private. And this is one of the tricks. Reinforcement learning models also suffer from this, of course; they’re trying to reduce the curse of dimensionality. That’s why they don’t train on all the data, and that’s why they’re so popular when the data gets too big. So the cool thing about this, Greg, is that the model parameters can be sent back up to the cloud, and there’s kind of a hive learning going on with balancing the parameters. It’s almost like an ensemble approach, just speaking at a high conceptual level. And this way all the models can benefit from each small sample subset. But then there’s also the balance here: you’re trying to narrow down on that specific person.
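
To make the "share parameters, not data" idea above concrete, here is a minimal sketch of federated averaging in plain Python. It is an illustration under stated assumptions, not a description of how Google or anyone on the call implements it: the three "devices", their data, the tiny linear model, and the function names `local_update` and `federated_average` are all hypothetical.

```python
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one device's private data.

    Only the updated weights and the sample count are returned;
    the raw (x, y) pairs never leave this function.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b), n

def federated_average(updates):
    """Server step: average client weights, weighted by client sample count."""
    total = sum(n for _, n in updates)
    w = sum(weights[0] * n for weights, n in updates) / total
    b = sum(weights[1] * n for weights, n in updates) / total
    return (w, b)

random.seed(0)

# Three hypothetical devices, each holding private samples of y = 3x + 1 plus noise.
clients = []
for size in (20, 50, 30):
    xs = [random.uniform(-1, 1) for _ in range(size)]
    clients.append([(x, 3 * x + 1 + random.gauss(0, 0.1)) for x in xs])

global_weights = (0.0, 0.0)
for _ in range(50):  # communication rounds
    updates = [local_update(global_weights, data) for data in clients]
    global_weights = federated_average(updates)

print("global model after 50 rounds: w=%.2f, b=%.2f" % global_weights)
```

Note that this sketch does nothing about the memorization concern raised in the question. The shared weights can still leak information about a device's data, which is why real systems layer further protections on top.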

So the art of getting that right would take some research, but that’s just my thought.

Any thoughts, Carlos or Van, on federated learning?

I just wanted to throw out some keywords around federated learning. Useful keywords to look into are edge computing and transfer learning. At a high level, the idea is, like you said, that it’s very dangerous to be recording data from users in a way they don’t understand. And Apple’s iOS 14 coming out is going to be a whole thing over the ability of people to block data collection. It can have a huge impact on the apps you use, so pay attention to that. The idea with federated learning is that it gets you out of the problem of: I’m going to record someone’s audio, ship it up to Google Cloud, they’re going to translate it for me and then send the results back to me. Federated learning would put a kind of stock model, transferred from the cloud model, on your local device, allowing you to have local model learning on the device. That is extremely useful because, like Tom said, you get out of the curse of dimensionality. No longer are they trying to understand your words in the context of all of the words that sound similar to yours. They just get what you said: you said this, and you didn’t like the results, so they clearly don’t understand you. So there are a lot of benefits to federated learning, like it’s actually more accurate because it’s just this one guy. But I’m not going to go too deep into it. I do recommend you look into the intersections of blockchain and AI, because with federated learning and edge computing and blockchain, there are a lot of people in that space trying to fuse all of those together, for the privacy concerns and the GDPR concerns, but also for the actual computing cost of storing and accessing data at high frequency in the cloud. So it’s a huge space. It’s going to blow up. Look at those keywords.

I don’t know what kind of phone you have, but I’ve been tinkering around quite a bit with ARKit for Apple, and I think that’s a really good way to understand how Apple is handling stuff like federated learning. Embedded in the augmented reality kit are also machine learning models, computer vision, language recognition, and so forth. And the fact that you can operate very sophisticated models on, say, a couple-of-years-old iPhone is really cool. So again, it depends: if you’re using Android, get the equivalent, which is ARCore. It’s an interesting way to segue into it, in a way that’s kind of fun. You can check it out.

It’s an interesting conversation, because on one hand you have federated learning going on, or distributed machine learning. On the other hand, I was talking to you last week about this, and it’s like you have almost a mega-API thing going on right now, where you have GPT-3, right?
And so you’re having these two almost polar opposite worlds, where on one hand you have this all-encompassing NLP API, and on the other hand you have individual devices. But great, great insights. That’s probably what they use all these things for. Van, anything to add to this conversation about federated learning?

I’m going to dance around and try to figure out how I can say this. I don’t know if I can. You’re in a really, really interesting space. I think you’ve already figured that out from everybody else’s comments. I would agree with everything that’s been said, but one of the points that you really want to look at, especially, is blockchain. I would look at blockchain and think about how you would maybe use blockchain to understand what your data has been doing, what it may have done before it got to this point, as sort of a metadata tag. Just think about blockchain that way, and see if you can embed a whole lot of data about your data in a blockchain in a secure way, so that you have access not only to a particular data point, but also to some of the transformations and things that happened to that data point along the way, as part of this sort of blockchain-ish concept. There are some other people out there who have written about this. It’s not coming to me off the top of my head right now. Like I said, you’re in a very, very interesting space, and a lot of these answers are really, really good answers. And like I said, the blockchain piece of it would be something I would look at as an important component of privacy, but also of knowing a little bit more about your data itself: not just the individual data points, but maybe a bit of the journey, the data behind that journey, and the provenance.

I think he’s under an NDA, and I can also add some blockchain stuff.
And I’m not under an NDA. The idea there is that when we create models, we’re creating them with certain data, under certain parameters, on certain devices, with certain input. And what blockchain can do is let you keep a permanent record of all of those steps and all of the... it’s kind of dependency management, in some way. And what it lets you do, though, is you have this metadata that’s also a second layer of transfer learning, so that on the same device, in the same conditions, you have a slightly different input. Like Tom said, you’re transferring parameters without transferring data, and it lets you get around a lot of concerns, on paper. So definitely dive in on the blockchain. And I’ll send you a paper on blockchain for that, which details how the blockchain secures the status of devices and the stuff that’s recording it in a chain.

Super. Thank you. Thank you, Joe. Thank you, Carlos. I appreciate that, Ben. Thank you, Tom.

We had you on spotlight here, so while you were making your way down the slopes, we all got to join in on that. Ben, do you have anything to add about federated learning? He’s probably enjoying himself on the slopes.

Um, I do, really quick. So federated learning became top of mind with COVID. Hospital networks were unwilling to share their data. In Utah, actually, I talked to a senior health informaticist. They literally said not enough people have died, and I feel that they didn’t understand the disease. To say that out loud sounds so stupid. They weren’t sharing their data. They can’t share it because of HIPAA. So, yeah, federated learning needs to happen. It’s super important stuff. Happy to send people a provisional patent I filed on tokenizing, kind of in the spirit of anonymizing learning. Happy to send it to others who are interested. OK, that’s it. Sorry for the distraction.

Oh, good, man. Thanks. Greg, we had your question. Was that satisfactory? That sets me on a nice route, so I appreciate the answers here. Thank you so much, everyone.

Right on. The next question we’ve got up is Amaia’s. Are you still here? And again, thank you guys for being so patient and waiting for your questions to come up.
After that, we can open it up, or we can call it an evening, because we’ve been hanging around a while. It’s been awesome. But first of all, am I saying your name right? Yes. Yes. Thank you so much.

Hello, everyone. My name is Amal. This is my very first happy hour, so I’m very excited to see all of you. It’s really very inspiring, and thank you for all of that. So, a quick introduction: I’ve been a scientist at Dow Chemical for the last year, and I put my question in the chat as well. But just to say it again: for the last year, my team has been using SAS, so we are like a SAS shop. Starting last month, my company has decided to transition from SAS to Azure, and we are learning about it. It’s a very different landscape. SAS is prepackaged with a lot of things, and I’m learning that in Azure there are a lot of manual things we really have to do. The company is going to use Python as the development language. So I’m kind of frustrated and struggling with how to switch from the SAS mentality to this cloud technology and landscape. So if you have any advice on how to make that transition and how to change that mentality from SAS, please share.

So I think first it’s get good at Python. I mean, you can probably simultaneously get familiar with cloud technology, but if you’re doing everything in Python, get familiar with using Python. I made the transition from SAS to Python myself. When I was a biostatistician, SAS was the language that we used for everything. I was in that role for almost five years, and I made the transition to Python, and it wasn’t too difficult. If you Google something along the lines of “pandas for SAS users”, the actual pandas documentation shows you the pandas equivalent operations for SAS code, and you can pick that up pretty easily. One thing you could do: you know SAS, you’re good with it, you know exactly what your output should look like. Great.
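
As a small illustration of the “pandas for SAS users” idea, here is a hedged sketch of a few common SAS steps written in pandas, with the result checked against numbers you already trust. The data, the column names, and the `sas_baseline` table are all invented for the example. In practice the baseline would be output exported from your existing SAS job.

```python
import pandas as pd

# Hypothetical data standing in for a SAS data set.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "units": [10, 15, 7, 3, 20],
    "price": [2.0, 2.0, 3.5, 3.5, 1.0],
})

# DATA step equivalent: derive a new column.
sales["revenue"] = sales["units"] * sales["price"]

# WHERE statement equivalent: filter rows.
big = sales[sales["units"] >= 10]

# PROC MEANS with a CLASS variable: grouped summary statistics.
summary = sales.groupby("region")["revenue"].agg(["sum", "mean"])

# PROC SORT equivalent: order rows by a column, descending.
ranked = sales.sort_values("revenue", ascending=False)

# Known-good output (made up here) playing the role of the SAS results.
sas_baseline = pd.DataFrame(
    {"sum": [50.0, 55.0], "mean": [25.0, 55.0 / 3]},
    index=pd.Index(["East", "West"], name="region"),
)

# Largest absolute difference between the pandas result and the baseline.
max_diff = (summary - sas_baseline).abs().to_numpy().max()
print(summary)
print("matches the baseline:", max_diff < 1e-9)
```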
That’s a baseline. That is the comparison for you. Now you can learn Python, recreate all of your SAS work in Python, check your answers, and see: OK, did I get the same output? Was that exactly what I was expecting or not? I did a whole bunch of that, and I got really, really good at Python and pandas really, really quickly. So I’d love to hear from Dave on this as well. Mr. Microsoft, go for it.

So just so you guys know, Amea and I have a little bit of history. He was a student in one of the boot camps I taught at a former employer for data science. What’s up, Amea? I look a little bit different: my hair is longer, and my beard is gray now. OK, so the first question I would ask is, if you’re moving to Azure, are you planning on using the Azure SaaS offerings in the cloud, for example Azure Machine Learning? Because if you are, they have a drag-and-drop-based interface. The code behind it is actually what’s known as TLC, an internal Microsoft framework, which is all based in C#, not surprisingly. So from that perspective, if you’re relying a lot on Azure Machine Learning, it’s not really a one-for-one transition from SAS to Python, because ideally you’re using all of the uplift that you get from Azure Machine Learning, which is not going to be Python at all. It’s going to be this kind of drag and drop. So it’s more akin to Enterprise Miner, in a way. So I guess my first question would be: are you guys really planning on building everything from scratch in Python in Azure, or are you planning on relying on the services?

No, everything will mostly be built from scratch. So writing Python code for all the pipelines: the data pipelines, the code, and then the deployment pipeline.

OK, well, there you go. So what Harpreet said is gold. You need to learn Python.

Yes, of course. I might be asking, why are you moving? What’s the benefit of moving to Azure?

It’s just the company strategy. They are moving to the cloud, SAS is expensive, and the company already works with Microsoft on all of the different systems. So they are asking all data scientists to move there.

Yeah, that’s interesting. As a former enterprise architect, I’d be like: if you’re moving to a platform like Azure, why aren’t you taking maximum advantage of the stuff that it has to offer? That seems kind of odd. I mean, there are a lot of great managed services on Azure that you might want to take advantage of, especially as a data scientist. Azure ML Studio, for example, is awesome. And so, yeah, everything from scratch... I don’t know what the advantage would be. It might actually be more expensive, because it’s mostly the cloud, right?
Yeah, most of the use case is hierarchical forecasting, and it’s forecasting to the point that we would have to write all of that model ourselves. I don’t know how drag and drop would handle that model yet.

I would suggest using... I mean, every cloud framework now has the ability to write a model and deploy it. And I highly, highly suggest you do it within the structure of the cloud. Anything else, something crazy like that, would be weird.

Yeah. So, for example, to Joe’s point, Azure lets you say: look, I want to use all the managed services except for right here, and I want to put my Python code right there. And you can do that. And that’s what you want to do if you can.

OK, yeah.

So, one of the mistakes you see people make when they move into a cloud environment is that they try to replicate in the cloud what they did before. And if you try to do a model-to-model sort of setup, it almost inevitably means more expensive infrastructure running 24/7, and it’s just more expensive if you run it that way. To try to do things the cloud-native way, I would suggest reading the docs or maybe even getting a cert or something. I think they have the AZ-900 fundamentals. Understand how Azure wants you to do stuff, and then do it that way. That’ll make your life a lot easier. The biggest trouble you’ll notice is that people come to the cloud and say, oh, I know what I’m doing, obviously. And then they get the bill and say, well, why am I suddenly spending all this money? Each cloud has its own way of doing stuff.

I was just going to say, generally with managed services, you’ve got to watch the costs, true.
Some of these services are quite expensive, but generally you’re going to save money on the auto-scaling aspect and on reducing your operational overhead. Your DataOps team can just be smaller if you’re not constantly managing these instances or spinning them up and down; it just happens automatically behind the scenes. So what I always said back in my architect days was: if you need the same number of software engineers when you move to the cloud, you’re not doing it right.

Awesome tips there. So hopefully that’s enough to get you started.

Thank you, Eric. Thank you for reminding me. I saw this pop up on LinkedIn earlier this week as well: Matthew Vltava, in the house, just started a new job with, I believe, is it Brinks? Yes, it’s Brinks Home Security. Not Brinks the bank guys; Brinks the guys making sure people don’t get into your house. That is awesome, man. So congratulations. Everybody, help me congratulate him. That is awesome. So what was your journey like, man? What was your journey from when you first knew that you wanted to get into data to finally getting this job?

Well, basically, I started out in digital marketing, building out Facebook campaigns and Google campaigns when that first happened. So, I mean, it was more questioning. First it was, OK, so why do I have KPIs here? And then it was, OK, so how do I set up these KPIs? And then it was like, wait, is there some statistical significance to these KPIs, and what can I dig out of this? It’s just always been: what can I learn that’s new, and what kind of insight can I pull out of this? So basically, I would say it was a natural progression because of curiosity.

And so your role is data governance analyst. So talk to us about what that role is like. What is the company expecting you to take care of? And did you have to train or learn on your own to be able to come up with the knowledge base for this role?

Well, as far as the knowledge base for this role goes, I did have some data governance background. Ever since a little bit before the pandemic started, I worked with startups, and I was a consultant. Most of these are small to medium shops; they’re not big organizations. So I’d be in there not only doing the data analysis, but I would also be working with ETL. And part of the thing I found out, as a lot of you guys have probably experienced, is the data governance problem: inconsistent tables, and ETL documentation that’s poor or not very built out. So it was mostly just experience from consulting and working with startups. As for the actual job itself, it’s more that I’m sitting in the middle. I’m not the data scientist, I’m not the data engineer or the data analyst, but I am the guy who’s there trying to work with multiple departments to standardize definitions and make sure data integrity is correct. And then I’m that annoying guy who shows up in the comments in your GitHub saying, do we actually have this KPI correct? Is this right? I’m in the middle between different departments.

That’s cool, man.
Congratulations on getting the new role. I’m sure everybody here is just as happy as I am for you. That is freaking amazing, man. Great job. Looking forward to seeing you move along in your career and continue to do great things, man.

So let’s see if anybody else has questions. I’ve gone through the queue here, so I’ll open it up again. Thank you guys so much for sticking with me and hanging out while I try to get to everyone here. Hopefully you guys all had an opportunity to provide some assistance. For right now, let’s roll it back up. Does Akshay still have a question? Because I think you had a question earlier, but we missed it. So if you have a question, go for it. OK, I see you right there. I’m not sure, because there were like two Akshays in the chat, whether it was you with the question or somebody else. But if you’ve got a question, man, go for it. I’d love to hear it.

I did have a lot of questions coming in to the call, but being part of this call, a lot of those were answered. So thank you to everybody. This is my first ever podcast and interactive session, and I’m impressed with the kind of responses and experiences everybody brings. So I’m looking forward to continuing that.

Right on. Well, thank you for dropping in, and thank you for joining the happy hours. And make sure you tune in to my actual podcast and listen to my other episodes. You’ve got a couple of weeks to catch up before I start releasing new episodes, so there’s plenty of time for you to do that. Anybody else have questions that we didn’t get to? I’m looking at either Greg or Juan.

I have one, Harp. Yeah, go for it.

Yeah. Hey, everybody, it’s great. My Fridays have become more memorable with these sessions. I’m just loving it. There’s so much to gain. Coming to my question: what kind of transition does it take to become a data scientist from a data analyst?
I mean, I keep reading and listening, and people are saying there’s not much of a transition, there’s not much of a difference. But then why are there two different titles?

For this one I’d love to hear from either Monica or Giovana or Mikiko. Let’s start with Monica, then we’ll see if Mikiko has anything to say.

Well, I guess it depends on what the specific job roles are, because I’ve come across data scientist positions which are truly data analyst positions. And I think the distinguishing factor is really the advanced analytical techniques: machine learning or NLP or any of those more advanced methods, versus just your fundamental statistics, finding trends and anomalies and such. That’s at a very, very high level. There’s probably so much more involved, and they blend into each other very much.

Yeah, definitely. It’s a great question, so I’d love to get a lot of people’s input on this. Giovana, do you want to go for it? And after Giovana, let’s hear from Ben.

Yeah. It’s always dealing with data in the end. I think that 80 percent of our time is data: cleaning data, exploring data to see the correlations, all these amazing things to know the story behind the data. So I think everyone has to have the data skills. And sometimes people ask for data scientists, but in the end, a lot of people are working as data analysts. So I think one is inside of the other one, so it depends. But I think it’s a very good start to learn about data analysis. It is important to know everything about how to handle data, because if you have done this first, and done it well, your model is going to go well and give you predictions in a good way.

So, yeah. Actually, I’ve got to ask Ben to chime in here, because we’ve had conversations about this, once on my podcast and a couple of weeks ago in a happy hour as well. I’d love to have you break it down for us, because I love the stance you take on this. Ben, can you hear me? OK, well, there, I heard Ben. I’m sorry. Oh, don’t worry, I’ll clip the silence on the actual podcast episode. I appreciate it, Ben.

This is actually something I’ve studied: the misclassification of people using job titles. The data science job titles are actually horrible, absolutely horrible labels, and we consistently use them. So if we’re talking about the difference between a data analyst and a data scientist, or a data engineer and a data scientist, or a machine learning engineer and a data engineer, you’re talking about a classification problem. And like I said, the labels are terrible. That’s the reason why people will say, hey, there’s no difference between a data analyst and a data scientist, or why some people even go as far as saying there’s no such thing as a data scientist.
They’ll say they’re all just glorified data analysts. And it really just depends on what you’ve interacted with as far as skill sets and capabilities, as well as what sorts of results you’re used to getting from data scientists, or whatever you call data scientists. So when you hear people talk about the difference between data analysts and data scientists: data analyst is the second most frequent job that comes prior to having the job title data scientist. It’s simply what is most commonly the pre-data-scientist role, that role before. And so when you hear a lot of these myths, what you’re really hearing is people talking about jobs they don’t understand very well. And it ties directly into companies not understanding very well what data science is. Is it different from machine learning? Are the skills required to do deep learning different from the skills required for data science? Companies don’t have a good point of reference, because most haven’t seen any of this stuff in production, and those who have, have seen very little of it actually be effective or functional in production. If you want to understand what a data scientist is and how you’re going to transition into the field, you have to look at a couple of target companies, a couple of groups that you would want to work with. Look at what the data science and machine learning teams are working on. Look at what they’re actually accomplishing, which means looking at what’s getting into production, what’s working, and what’s making money: anything that they put on a quarterly statement, anything where they say, we have booked revenue of a real value. They put a number on it, and then they talk about it being tied to their machine learning efforts. This is really rare. But if you look at that level of specificity, you can see how very few companies have come to terms with monetization. You start at the end, with “we’re making money”, and then you work backwards to “these are the actual people we need in order to continue to make money, and to make more money, on machine learning”. So if you want a career in the field, follow the cash. Look for any sort of skill that you can use to create a tangible outcome. And when I say tangible, you’ve got to get beyond the buzzwords of machine learning or data science, and beyond “model”. I mean, what kind of model are we talking about? Marketing impacts? Business cases around pricing strategy? Decision support? So you really have to dive into your career and where you want it to go. What niche of the field will help you to build value working for the types of companies that you want to work for? The only way you’re going to get there is to be smarter than the employers and the people that are trying to hire you right now, because the people that want to hire you don’t know why they want to hire you. They want to hire five years of experience, or 18 years with a technology that’s two years old. And this is where the ridiculousness comes in. Again, back to that misclassification: I can call my dogs data scientists. Does it mean that they’re creating five times the value of their salary for the company? I’m paying them in kibble. That’s really how ridiculous it is. That’s the point that we’ve gotten to. So if you want a good career, forget the job title. Build a cluster of skills that will allow you to create, build, and add value, and get really good at creating specific use cases within the business for machine learning, whatever that machine learning looks like. Even if it’s just analytics, get really good at tying what you just did to a dollar value. That’s really where machine learning is going: that dollar sign, being able to monetize, and being clear about how you have used a capability. Not just a word like Python, but how you’ve used the capability to build something.

Sorry, I kind of hijacked that.

No, I absolutely love that, man. Actually, shout out to Al Bellamy in the house. Al, do you want to talk to us about the difference between data scientist and data analyst? I’d love to hear what you have to say about that. Um, I’m not sure if your mic is functioning or not. Um, I don’t see my name on my screen. Al Bellamy? All right, I’d love to hear from Dave. And Al, you are here.

I am. Yeah, I’ve been kind of ducking in and out. I keep waiting for the Friday where I actually get off work at a normal time. That’s my back door right there. So, OK, yeah, I sort of helicoptered in here two minutes ago. I mean, I personally don’t have the kind of hard skills to claim the title data scientist.
If you told me to model something or predict something, I, you know, I could do some regressions and kind of basic predictions. But if you told me to analyze something, then, OK, cool, I can do that. You know, let me look at the data. Let me do some media, you know, some basic stuff in Excel. I’m working on some better things. I can combine stuff in SQL now. But yeah, as far as a path forward for me to go from analyst to scientist, that’s what I’m still trying to figure out. But I’m new to even really thinking about that path.

So I and people are in their dreams. And thanks for it. Thank you for sharing that. I would say that after I die, I forget my little thing here. I’d love to hear from Dave or Tom on this, but I think the main difference is that an analyst analyzes, a scientist discovers. Right? And I think that also teeters on the subtle difference between inference and prediction. I would say that an analyst might spend more of their time on the inferential type of tasks, descriptive type of tasks, whereas a scientist would be more, I guess, predicting and forecasting and things like that, if that makes sense. Dave, Tom?

I hope I don’t intersect with Dave. I’ll just chime in. I love Eric’s answer in the comments. I think he’s getting to the heart of it. And I want to point out, we are the historical figures of the data. And you’ll figure, I mean, we’re just now seeing an explosion on something that’s going to radicalize what we’re doing: attention and transformers. And the terminology is just going to be messy for a while. And, you know, what a data analyst does, what data visualization does, what a data scientist does: that is going to overlap a lot for a while. I find that the more I’m trying to do a good job, I stop and constructively criticize myself.
I realize, you know what, you need to be more like Kate Strachnyi in your pipeline work so you’ll do a better job, so that you really understand the data better before you just dive in and apply a model. And so I just love it that we’re all still learning from one another. But, and I don’t mean this critically, I kind of get irritated by a hyper-concern with titles and differences in roles. I know we’ve got to do it a little bit, but we’re just so early in the data age. We’ve got to be flexible and learn on the fly, and it’s just still going to be messy for a while. That’s all I’m saying. But I really did like Eric’s answer. I think he’s getting at the spirit of what we need to think about.

Eric, do you want to share your answer with the podcast audience, who cannot share?

So I did just a little bit of research about it and found out that, like, physicians didn’t actually have specialties at all until about two hundred years ago, which is when it finally started. And so, as medical science has progressed, roles and titles have become super specialized, because it used to be that, you know, all of your physician tools could pretty much fit in the bag that you would take wherever you were practicing your medicine. Right? But then things got more complex through trial and error and regulation. There were a lot of medical schools just handing out degrees for basically no work. You know, it was kind of a bonanza, right? I don’t know where else we might have ever seen a bonanza of any sort like that. But anyway, it’s not going to take two hundred years to subdivide data science, obviously; we move at breakneck speed now. But it’s helpful to me to keep that in mind, that there’s no point in chasing after a title, because right now it may be super valuable, and then in like five years it’s going to be like a dinosaur title, because all these cooler, newer, you know, junior data unicorn titles will have come out that people are way more into. This isn’t the first time an industry has matured and specialized.

Dave, let’s hear from you. And then after Dave, let’s hear from Mikiko.

Yeah. So seven years ago, I would have said, if you’re not doing machine learning, you’re not a data scientist. And I was wrong. I was completely wrong, because I just work at an insurance company, and actuaries would be like, dude, we’ve been doing data science for decades, man. We’ve been using data to drive business results. And then the older people, the operations research people, will be like, no, no, no, wait a sec.
We started doing this in World War Two, man. We were data scientists before the actuaries were data scientists. So I wouldn’t worry too much about the title per se. I really like what Vin had to say, which is: target what you want to do, and grab the skills that you think are going to be useful to deliver business value using data. In most of my content on LinkedIn, I typically use phraseology around this idea, like, are you a professional that wants to drive business results with data? I don’t say, do you want to be a scientist? I don’t say that, because maybe you work in HR, or you work in supply chain. These days I don’t think it really matters. Focus on how you can drive business results with data and then work back to the skills that you need. And that’s going to be all the basics. It’s probably going to be SQL. It’s going to be some sort of scripting language like R or Python. It’s going to be some statistics. Not nearly as much statistics as you think, by the way, in practice, generally speaking. Not even close. And it’s definitely not as much machine learning as you think, generally speaking, as well. You can get away with very few simple and powerful techniques unless you’re in some sort of specialized area like self-driving cars or something like that. So the advice that I give is, first, take my opinion as one data point. Go get other data points, by the way. That’s the first thing. Second of all, do research. Look at the companies, look at the roles that you want to do, cross-reference the descriptions of what you’re going to do and what kinds of technologies they list, and then that’ll give you some idea of what you need to do in terms of skills and capabilities to actually make it to a data scientist title. Or maybe you can just be a data-savvy professional. Maybe you work in marketing, and you figure out a better mousetrap in marketing. Awesome. Guess what?
You’re going to get rewarded for it, and you’re going to build up your portfolio, and then that can take you anywhere you want to go.

Thank you very much, Dave. By the way, everybody wants to see 2013 Dave with the short hair and the earrings, apparently. Mikiko?

Yeah. So, yeah, just some background. I started off in, like, ESOPs, growth hacking, like whatever the title was in Silicon Valley at that time. Then I moved into, like, a data analyst role and title, right? And then I moved to a data scientist role. And now I basically do everything under the sun except sign the vendor contracts for the staff that I’m at, right? So the lines there are a little bit more blurry. But I think, to kind of summarize everyone’s points: first off, historically, data scientist was just the catch-all bucket when D.J. Patil wrote his piece, and we’ve had some more roles that have specialized since. And so it used to be this idea of, like, the PhD who is also the super computer engineer and all sorts of stuff, and they talk to people and not make them feel like idiots, you know, in the boardroom. That was just kind of like the first one, that was the V1, right? And the V2 is we saw some other roles pop up. It was, like, the data analyst, the data scientist, and then I think the data engineer. I remember seeing a bunch of blog posts where those were the three, right? And it was basically, like, strategy slash living with the business; in the middle is kind of like still the research person who would build their own models, do their own analysis, write like a white paper for the company or do a prototype or understand, like, oh, this new algorithm, can we incorporate it into the product? And then you have the engineer who worked on the infra. And I think now we’re kind of going to, like, the V3, V4, or, if you’re like some companies, let’s just skip, you know, go to like V8, right, where you’re seeing some more specialized roles like machine learning engineer and also other stuff. So, you know, one thing to remember is there’s this historic trend, as Eric pointed out. The second reason why you see those titles, and they sometimes are the same thing, is because companies will use them as traps to try to get people into the role. They’ll say, like, well, we can’t offer you as much money, but we’ll offer you a title, because we know we can sell it to you as, like, if you have a data scientist title, then it’s easier to get recruiters in the future. So there’s that. And then I think the third point that people brought up is that sometimes hiring managers actually just don’t really have a good understanding of what those needs are. And typically you’ll see that in, like, an embedded sort of structure, where a hiring manager on the business side is hiring for a data scientist or data analyst who lives with them, right? So you do have those three factors going on. Now, that being said, some companies, like Facebook and Microsoft, are specialized enough that they have what are called decision scientists. So they have data analysts, they have a data scientist or a research scientist, and they have a decision scientist. The data analyst is still living with the business partners, is still helping guide the business. The decision scientist usually is more involved in the experimentation and inference aspect. And then the research scientist is just doing research. Right?
So part of making that switch, potentially, is first off looking at companies where you could make that title change, or just become more senior in your existing role. Some companies will be willing to negotiate that title if, you know, they’re flexible enough. Some companies, like Google and Apple, do have a ranking ladder, so you can’t just negotiate a title. You have to actually have a switch in responsibilities. So that’s one method, right: you negotiate at your own company. The second way to do it is sometimes just going to another company where maybe they’re doing the same work or something similar and they’re just calling it a data scientist title, you know? But I think those are sort of like shortcuts, right? I think, as everyone pointed out, you really want to identify the kind of work you want to be doing, whether it’s working with the business and all that, doing research, or doing engineering, and then just try to figure out, well, what is the next iteration in that particular track? And I think that’s probably the best way to go. My particular trajectory: I started off on the business side, in growth as a growth hacker; moved over to analytics, because at that point I had really enjoyed using data to help inform the business; moved to data science, because I like the idea of focusing in on a problem, you know, researching it. And now I’m moving over to the engineering side because I like the idea of building products, and that’s just kind of how it is. So you can make the switch either through the title or through getting more senior in your responsibilities, or moving to another company. But I would encourage just understanding, like, what’s the kind of work you want to be doing, like, let’s say the next one into your schmiel, and then sort of planning around that.

Absolutely love it.
Thank you very much, Mikiko. Last call for questions. If anybody has a question, just type it out into the chat.

I also just want to say, obviously, ignore the titles. When I was leaving grad school in 2013, I was interviewing for roles that were called Predictive Modeler in the actuarial field. And now that job title doesn’t exist. But that was data scientist, data science work, which was just like random forests in SAS.

Tom, how are you doing?

Real quick shout out, everybody. If you quickly hold your hands like this and then start doing this for Harpreet Sahota. Harpreet, your entire pattern of happy hours, you’ve built an incredible community. We praise the good job, brother. This is awesome.

Thank you so much. We had forty people show up today. I mean, there’s so many people that I wanted to hear from that I just didn’t get to. Matt Housley, thank you so much for being here and being so active in the chat, man. Appreciate you. Everybody else who’s been so consistent showing up, people here from day one. I started this thing 14 weeks ago. 14 weeks ago it was just like me and four people. And then one week Vin came, and then one week three and came. And then after that it was just like thirty people minimum. It’s been awesome. Thank you guys for spending news with me.

Yeah, definitely. So why did you start this? Why?

So, like, the podcast in general?

Just the open office hours.

Yeah, the office hours. Yeah. So for Data Science Dream Job I do, like, multiple office hours a week, like, you know, six hours a week, pretty much, of office hours. And that’s great if you can afford the program, because it’s expensive. And I figured, how can I help more people by using a skill that I already have, during a time of day where I’m usually doing this already? And, you know, I figured I would just do open office hours and invite people in and help them out somehow, and that’s kind of how it started. So, Fridays at 4:30: most of my office hours are usually at 4:30 or 6 p.m., and so Fridays were open. Had nothing to do. Obviously I can’t go to the pub anymore after work. So let’s do office hours and hang out with people. That’s kind of how it started. And, you know, just to inject more of the data science into the podcast, because I know I’ve kind of shifted in a new direction where I’m interviewing just authors of books that I find amazing. And I’ve been fortunate enough to convince people to come on to my show, like Lanying Robert Greene. Like, that’s mind-boggling to me. But it’s more of a way to keep data science as part of the podcast while I venture off and explore other areas for interviews, and still keep that element, um, with the podcast. Yeah. So thank you guys for joining. Happy holidays, Merry Christmas, Happy New Year. For those of you that joined the festivities and wore your festive sweaters, thank you so much. For those of you that joined me in a drink, cheers. I really appreciate that. And if you’re the average of the five people you spend the most time with, I spend the most time with 25 to 30 amazing individuals every Friday. You guys are like the only people I hang out with nowadays. So thank you so much for being here and just helping me raise my average that much higher. Take care. We’ll be back January 8th for the next happy hour, and they’re happening every Friday after that. Um, new stuff happening with the podcast next year. I’ve got some awesome, awesome guests coming on the show. I’ve got some awesome podcast episodes recorded. I’m really excited to share all of this with you guys, and I hope you guys keep coming back. Hope you guys continue to show up and help support everybody. Thank you so much, guys. Take care.
Have a happy holidays. And remember, you’ve got one life on this planet, so why not try to do something big? Take care, everybody. Yeah.