#182 – Data science drives personalized insights towards health goals | Helena Belloff & Sunny Negless
Episode introduction
Show Notes
Translating complex health data to valuable insights that inform better food and lifestyle decisions is the heart of what the data science team does at Levels. In this episode, Helena Belloff shares her day-to-day as a data scientist at Levels. Helena talks about how the data science team uses data collection, interpretation, and analysis to further the company’s mission of helping people achieve their health goals through the Levels program. Look for multiple new shows per week on A Whole New Level, where we have in-depth conversations about metabolic health and how the Levels startup team builds a wellness movement from the ground up in the health and wellness tech industry.
Episode Transcript
Helena Belloff (00:00:06):
One of the biggest jobs of a data scientist is to be the translators between data and people. Our users give us a ton of data and we have to figure out a way to make it make sense to help members achieve their goals, make it as personalized and customizable as possible. Member A might want to focus on weight loss. Member B might want to focus on longevity, and like Member C might want to focus on finding an optimal diet that allows them to avoid the 3:00 PM slump. So how do we communicate all this information in a way that is actually going to be useful for them? And how do we present it?
Ben Grynol (00:00:51):
I’m Ben Grynol, part of the early startup team here at Levels. We’re building tech that helps people to understand their metabolic health. And this is your front row seat to everything we do. This is a whole new level.
(00:01:17):
In most companies you never really know what different functions do. Maybe you know a little bit if you've had some exposure to one, or maybe you've got a friend that works in a certain area that's different than your day-to-day work. Well, in the case of Levels, being a remote company, you have to be more intentional about sharing what each team or each function is working on at any given time. And there's no better way to do that than to dig in with the team members themselves and start to ask pretty honest questions. What exactly does this team do every day? What exactly does this function do? How does it contribute to our mission and what we're working on collectively as a team? And so Sunny, part of our ops team, and Helena, part of our data science team, sat down and discussed the data science function.
(00:02:02):
How exactly does all this data that we aggregate about metabolic health, all these different health data points, all these different glucose data points, how does food affect people's health? What do we do with this data? Where do we warehouse it? There are a lot of questions that come up when you start to wonder what we can do with this data to actually help people. How can we present it to the world? And what are ways of thinking about this data set moving forward? Anyway, it's a great conversation where they're able to dig in and talk through some of these questions. What exactly does data science at Levels mean? No need to wait. Here's the conversation with Sunny and Helena.
Sunny Negless (00:02:42):
All right, we’re talking data science at Levels today. So talk to me. I’m going to lead it out with Helena, what do you do? I know, I know anyone’s listening to this could totally listen to your podcast from back in February. I looked it up this morning, it was episode 92. But let’s start, assuming no one’s listened to that.
(00:03:00):
What does a data scientist do specifically? What does a data scientist do at Levels?
Helena Belloff (00:03:04):
Yeah, so I feel like I’ve answered this question many times before. So often I feel like people confuse data analytics with data science. Analysis is a huge part of data science, but it’s actually quite an interdisciplinary field. And because it’s so interdisciplinary, it’s also inherently creative, which is sort of, at its core, what I love about it. But at its very, very core, we have all this data. We have behavioral data like food logs and exercise logs. We have heart rate and step data. We have biometric data like A1C and glucose levels and cholesterol. And we have these very, not vague, but overarching questions about what all of this data means when you join it together. And so that’s kind of where I come in.
(00:03:59):
But taking a step back as a data scientist at Levels, one of the first things that I looked at when I started here was data collection. So just the first step, how do we get data? And what data do we need? How are we going to answer these questions? Are we being intentional and actionable about the data we're collecting? So that's one of the first things. I work to help decide what data is actually relevant to our mission and useful for our members. And then to help ensure that we're not just collecting data for the sake of collecting it. We want our members to trust that we're being intentional, to trust us with their data, and to never feel like we're using it for anything except furthering our mission and helping to meet people where they're at, to quote our head of engineering, to meet people where they're at and get them to where they want to go, help them achieve their goals. So that's the first thing. Every bit of data we collect is done to support the health of our members and metabolic research.
(00:05:07):
So that’s kind of the first thing is data collection. And that also includes other things like understanding app usage to help us improve our product. And then doing all this in a way that is anonymized and everything’s properly redacted and safe and secure. So that’s kind of the first thing.
Sunny Negless (00:05:32):
That’s a lot of things right there.
Helena Belloff (00:05:34):
That’s a lot of things. Like I said, it’s very interdisciplinary. So first step is what data do we need, how do we get it? Second step is like, okay, what format do we store it in? How do we clean it up and what’s the best way to store it? And how do all of these pipelines, how does it go from the app to our data warehouse? And what data do certain data users on the warehouse have access to? And what format is it in that’s usable for analysis, modeling, feature development, research, et cetera. So yes, that is a very, very large part of it, for sure.
Sunny Negless (00:06:21):
And I feel like I interrupted you because I feel like you were going to keep going like, Oh, there's so much more.
Helena Belloff (00:06:25):
I will.
Sunny Negless (00:06:26):
Okay. All right, then let's put it all on the table. I'm almost thinking of the pillars. I think about Cissy's most recent interview, the pillars of community. I'm thinking the pillars of data science. So far I'm hearing we've got to bring the data in, we have to filter the data, we have to select for the data that's going to be most useful in helping members see how food affects their health and how other lifestyle levers affect their health. But we primarily focus on how food affects their health. So bringing the data in, what does that look like. Filtering the data, anonymizing the data. So making sure data privacy and safety is a core tenet. I'm listing the pillars now, we have a pavilion at this point in time. So I got those guys.
(00:07:06):
And then I’m also hearing right now, and also app use. So it’s not just food, it’s not just logging what kind of exercise. It’s also like how are people interacting with the app so that we can be most effective? Which I think feels like that goes in with engineering. But there’s a unique side of it for data science. So I interrupted, but I already have four pillars.
Helena Belloff (00:07:26):
Yeah, yeah.
Sunny Negless (00:07:28):
Let’s keep going.
Helena Belloff (00:07:29):
And so a lot of that is working very closely with engineering to ensure that those pipelines are optimized and scalable and reliable. So that’s definitely a very huge part of it. And we want to ensure that those data streams are constantly flowing so that we don’t run into any problems and we’re collecting data in just a very organized, systematic way and storing it. So that’s all sort of the first steps. And then after that comes, okay, what do we do with that data? Now we have all this data. It’s in a suitable format for analysis. We make sure to store data in a way that is globally useful because we have a lot of different folks on the warehouse, for example. There’s support folks, there’s operations, there’s engineers, there’s data scientists, there’s product. And so we don’t want to just store data in a format that is going to be only suitable for a machine learning model that a data scientist is running.
(00:08:45):
We want it to be useful for everyone. So there’s that. And then it’s like, okay, well we collect a lot of data at Levels.
Sunny Negless (00:08:57):
The largest data set of non-diabetic glucose in the world, I believe. It’s massive.
Helena Belloff (00:09:03):
It’s massive. And so I’ll give you some numbers here. Yeah.
Sunny Negless (00:09:08):
Oh yes.
Helena Belloff (00:09:09):
Right now we have 3.4 million food logs. We have 172-ish million individual glucose data points. We have 11,000 subscribers and we have 25.6K members with glucose data. So it's a lot of data and it's growing, we are growing. So my sort of role after all of the data governance and privacy and cleaning and collecting and all of that, it can include anything from, okay, I use this example often, can we predict a member's glycemic variability in response to eating a peanut butter and jelly sandwich?
(00:10:06):
If they've eaten it a bunch of times, can we predict what the shape of their glucose curve is going to look like when they eat it the sixth time? Or a project to suggest relevant, healthier alternatives to foods that cause members to have large glycemic variability. And there's a lot of stuff that goes into that. Feature implementation involves a lot of architecture to support it on the data side. We can spend all of this time designing and going back and forth with product, developing, implementing a feature, but if we don't have the data to support it or the infrastructure in place to support it, it's not going to work. So there is a lot of work beyond just algorithm development that has to be put in place to actually go from conception to live feature in the app, useful for people, giving people personalized insights. And that's something that I think a lot about, and that's more of the internal project side.
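To make that concrete, here is a minimal sketch of one naive way such a prediction could work: average a member's past post-meal glucose curves for a given tagged food. This is an illustration only, not Levels' actual model; the DataFrame layout and column names (member_id, timestamp, value, logged_at, food_tag) are assumptions.

```python
# A naive "predicted response curve": the average of a member's previous
# post-meal glucose curves for the same tagged food. Illustration only.
import pandas as pd


def predict_response_curve(glucose: pd.DataFrame, logs: pd.DataFrame,
                           member_id: int, food_tag: str,
                           window_minutes: int = 120) -> pd.Series:
    """Average a member's past post-meal curves for one food as a naive forecast."""
    meals = logs[(logs.member_id == member_id) & (logs.food_tag == food_tag)]
    readings = (glucose[glucose.member_id == member_id]
                .set_index("timestamp")
                .sort_index())

    curves = []
    for logged_at in meals.logged_at:
        # Take the readings in the window after this instance of the meal.
        window = readings.loc[logged_at:logged_at + pd.Timedelta(minutes=window_minutes)]
        if window.empty:
            continue
        # Re-index by minutes since the meal so the instances line up with each other.
        minutes = ((window.index - logged_at).total_seconds() // 60).astype(int)
        curve = pd.Series(window["value"].to_numpy(), index=minutes)
        curves.append(curve.groupby(level=0).mean())

    if not curves:
        raise ValueError("no prior responses for this food")
    # The element-wise mean across past instances is the predicted curve shape.
    return pd.concat(curves, axis=1).mean(axis=1)
```

A real system would also have to handle sparse readings, confounders like exercise and sleep, and far more history than a handful of sandwiches, which is part of the architecture work Helena describes.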
(00:11:20):
How do we transform our data in some way that will be more useful for this feature or that will be globally useful for features that we might implement in the future? How do we streamline operations processes? How do we get support data into the warehouse so that support people can look at this data in the context of all of our app data, really gauge things, and do serious analytics? And basically, I think all of this is, how do we optimize things? How do we streamline things? How do we automate things and how do we do that all in a way that helps people? I think that's at its core. I'm rambling.
Sunny Negless (00:12:05):
No, it's fantastic. I love this. So when we first met, you and I connected in April and I felt a little silly, but I was like, wow, you could be a data scientist. It's just something that wasn't really presented to me as a woman in science. And my mind exploded about all the possibilities. You said something like, we're basically just really curious geeks. And I loved that because I was like, there's correlation, causation, there's so many things that you can pull from. But what I'm hearing here, to bring it back to our conversation at the moment, is it sounds like our focus, when I think about data science at Levels, is affecting the member experience through logging and then analysis there. But I'm already, I guess it's not surprising, but it is to hear, oh, it's also app usage. It's also thinking about how, as someone on the support team, we're pulling in support data on what people have been looking at as far as features go, maybe flagging features with product. People are excited about this or they like this, but wouldn't it be great if we could do this?
(00:13:08):
So I'm hearing, with support, I'm thinking we'd pull that in with product, but then there's definitely this data science element of, well, what do we have already that we could start to satisfy that need with when it comes to developing product and features in the app. So yeah, interdisciplinary is definitely where it's at. And what I also think is surprising is the team is quite small, we're a startup. We are scrappy. How big is the data science team at Levels?
Helena Belloff (00:13:37):
So we are a team of three right now, which has been amazing because up until a few months ago, it was just me.
Sunny Negless (00:13:48):
Right, that’s a lot.
Helena Belloff (00:13:53):
Yeah. Like I said, we have a lot of data and we're working with health data. Health data is, at least in my experience, much more complex than data that you might pull from the stock market, for example. That data goes through similar pipelines, but it's not super messy. Our data comes in and it is very messy. It is just very unstructured. We have freeform logs where you can type in freeform text. We need to figure out how to clean that on the back end and make it usable, and keep that cycle going of taking in data, being actionable about it, and giving the user something back that is going to be useful for them. And it is nice to have more hands on deck when it comes to developing pipelines like that.
Sunny Negless (00:14:53):
For sure. So I want to put a quick pin in because I want to ask, I have the pleasure of hosting onboarding parties. So each week I get to work live with members of our community, usually folks who are pretty new to the app, and I'm trying to get them either onboarded so they're starting to log their first meals and activities, or maybe they've had a little time with the data. And then they always ask, the question I get all the time is, how am I supposed to log? So I want to preview that first question and come back to the way that is best for you as a data scientist versus what's most useful for the member. So what would you, as the data scientist at Levels, say is the best way for a member to log their meals in the app?
Helena Belloff (00:15:34):
Yeah, so I guess I could answer that by just describing what we used to do to clean the log data versus what we do now, and what format we sort of translate it into. So what we used to do, because we have freeform text, it can get very messy. For example, you can write peanut butter a million different ways, which is something that I've learned through looking at our data. You can write "pb", you can write "peanutbutter", one word, you could just totally misspell it, but you know that it's peanut butter. You can write pb&j, you can do random capitalizations, there could be random punctuation in it and so-
Sunny Negless (00:16:22):
You want to capture it all because it’s all the same thing we’re looking at it’s just-
Helena Belloff (00:16:26):
It's all peanut butter, right? And so any human reading that is going to know PB is peanut butter, but a computer is not going to know that unless you tell it. So basically what I used to do is I would take a bunch of freeform text and I would go through a bunch of natural language processing algorithms to clean it up. And so basically NLP kind of tries to model the way humans think. So I know that "peanutbutter", one word, is peanut butter. How does a computer know that? We have to tell it. So we have to clean it up and that's sort of what I try to do. I'll get rid of punctuation, I will change everything to lowercase. I will get rid of accents because that doesn't give me any context into what this person ate; jalapeño with the accent is still jalapeño.
(00:17:29):
If it doesn't add context, we get rid of it. And doing things like, it's called lemmatizing, right? So if you have a word like "stopping", for example, if you run it through this algorithm, it will change it to stop, because stopping and stop, it's the same. You've stopped, you're stopping, stop. And so that's essentially what we do. I would take potatoes and it would change to potato, just get the root of the word. I just care about what you ate. We're not super focused right now on the quantity, although I'm sure that we will get there. But I want to know, okay, you ate potatoes, potato. And then I would also try to determine, okay, what ingredients make one ingredient, right? What does an n-gram where n equals one look like? What does an n-gram where n equals two look like? Peanut butter, that's a two-gram. So that's one thing.
(00:18:41):
But peanut on its own is just a peanut.
Sunny Negless (00:18:44):
Peanut butter on its own is a very different thing.
Helena Belloff (00:18:48):
Yes. But when they're next to each other, that's peanut butter. So I'll try to do stuff like that, and I am very systematic. Without getting too into the technical, all of this is very, very manual, as you can tell. In my head I'm like, okay, algorithm 1, 2, 3. It's going through all these functions and it's not perfect. I could end up with just random things that make no sense at the end of the day. So that is what we used to do. Now part of that system is automated. So we have our tags, which people always like to confuse. I think tagging should be specific only to the back end. I think members should only be concerned with logging.
(00:19:40):
Just type out your food. We have our auto suggest where, when you start typing, it'll come up and you can click on it, that's autocomplete, it'll fill it in for you and you hit save. And then as a member, you don't worry about it. What happens on our back end is we run part of that automated system where we'll pull out tags. So if you typed "potatoes", that gets automatically tagged as "potato" on our back end. And so we know that this person logged potato and we can do a bunch of analytics on it and come up with very customizable insights based on that clean, well-structured data. And so that is sort of what we do now.
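As a rough illustration of the cleanup and tagging flow described above, the sketch below lowercases the text, strips punctuation and accents, lemmatizes each token, and then matches single tokens and adjacent-token pairs (the "two-grams") against a canonical tag vocabulary. It is not Levels' actual pipeline; the tag list and aliases are made up, and it assumes NLTK's WordNet data has been downloaded.

```python
# Toy cleanup-and-tagging sketch for freeform food logs. Illustration only.
import re
import unicodedata

from nltk.stem import WordNetLemmatizer  # assumes the WordNet corpus is available

lemmatizer = WordNetLemmatizer()

CANONICAL_TAGS = {"peanut butter", "potato", "pizza", "jelly"}            # hypothetical tag list
ALIASES = {"pb": "peanut butter", "peanutbutter": "peanut butter"}        # hypothetical aliases


def clean_log(text: str) -> list[str]:
    """Normalize a freeform log into lowercase, accent-free, lemmatized tokens."""
    text = text.lower()
    text = "".join(c for c in unicodedata.normalize("NFKD", text)
                   if not unicodedata.combining(c))          # "jalapeño" -> "jalapeno"
    text = re.sub(r"[^a-z0-9\s]", " ", text)                 # drop punctuation
    # Lemmatize as noun then verb: "potatoes" -> "potato", "stopping" -> "stop"
    return [lemmatizer.lemmatize(lemmatizer.lemmatize(t), pos="v") for t in text.split()]


def extract_tags(tokens: list[str]) -> set[str]:
    """Match single tokens and adjacent-token pairs against the tag vocabulary."""
    tags, i = set(), 0
    while i < len(tokens):
        bigram = " ".join(tokens[i:i + 2])
        if bigram in CANONICAL_TAGS:                          # "peanut" + "butter"
            tags.add(bigram)
            i += 2
            continue
        token = ALIASES.get(tokens[i], tokens[i])
        if token in CANONICAL_TAGS:                           # "pb" -> "peanut butter"
            tags.add(token)
        i += 1
    return tags


print(extract_tags(clean_log("Very inferior Connecticut pizza, plus a PB")))
# {'pizza', 'peanut butter'}
```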
(00:20:26):
I don’t know that there’s a right way and a wrong way to log. I mean in an ideal world maybe that would be sort of integrated into the logging experience. But you have to think about what is most usable, what’s the user experience going to be like if we take away freeform logging? I mean I am very sarcastic in my logs. I’m from New York so I’m a pizza snob and I like to write logs like, “Very inferior Connecticut pizza”. And our backend will take care of the work of saying, okay, she ate pizza, we don’t care about this other stuff. Although good information to know.
Sunny Negless (00:21:12):
That's hilarious and fantastic. And so helpful. I'm happy to say I think I've been furthering, or at least helping, data science and the member experience, because what I say is kind of exactly the same thing: write it the way you would describe it to a friend, or write it in a way that works when you go into your historic log, because that's my favorite part of the app, going back and being able to look over time. You can search for eggs, you can search for omelet, you can search for hamburger, you can search for the things that are important for you, that you're curious about. Maybe you get a surprising result.
(00:21:46):
Maybe, we hear this about sweet potatoes all the time, where some folks expect one result and they get the exact opposite. One of my favorite ways to use the app is to use that historic data to go back, search for the foods, search for the logs, and then see, you kind of overlay two graphs or so at a time. And you can see, well, how are they different? How are they the same? And I think that's where the magic is. That's where we start to really get to see this objective data and go, okay, what can I do about this? How can I use this to craft some self-experimentation?
(00:22:19):
So I say, you know, you don't necessarily have to list every single ingredient. If you have a summer Cobb salad, it doesn't have to be arugula and this and that, you don't have to list every single thing, but write the things that are important to you. So if your coffee, the way that you do your coffee, doesn't really do much for your glucose, you're not super worried about it, you just want to track it, hit coffee. If you discover that one morning your coffee gives you a spike and another morning it doesn't, that might be something of, well, what was different? Did I add allulose versus did I add sugar? Did I add almond milk versus dairy milk?
(00:22:57):
So then track the things that are important to you. Write it in natural speech. You mentioned we take care of the extraneous stuff. So if you want to write this brand almond milk, cool, we may not be looking for a brand at this point in time, we’re looking at almond milk. But that’s going to be helpful for you in your own analysis. Is that kind of, am I on board with that?
Helena Belloff (00:23:19):
Yes, yes, exactly. So a few months ago I did a member research project where I essentially went through certain members’ data with their permission obviously. And we would meet once a week and we would go over what are the insights for the week. So this one member, he couldn’t figure out, he was feeling very sluggish after lunch. He had that 3:00 PM sort of crash that the best of us get. And he figured it’s because I eat the sandwich every day at lunch and I can’t eat bread anymore and blah blah blah. And I actually went through his data and I saw, okay, you ate this sandwich on this day, you didn’t do very well, but then you ate this sandwich on this other day and you did better. What did you do here? And he was like, I actually went for a walk after. And he was like, “I actually remember that day I didn’t feel as bad as I usually feel.” And I was like, “There we go.”
(00:24:27):
And so another part of what I do actually is thinking through, okay, how do we make that process of me going through his data and sitting down with him and effectively communicating and saying, here's a side by side of your graphs, here's with a walk, here's without a walk. How do we train a computer to do that entire process? So that's the final sort of step is I think I've said this before, but one of the biggest jobs of a data scientist is to be the translators between data and people. Our users give us a ton of data and we have to figure out a way to make it make sense to help members achieve their goals, make it as personalized and customizable as possible.
(00:25:25):
Maybe Member A might want to focus on weight loss, Member B might want to focus on longevity and Member C might want to focus on finding an optimal diet that allows them to avoid the 3:00 PM slump. So after we do all of that fancy stuff on the back end, how do we communicate all this information in a way that is actually going to be useful for them and how do we present it? And so that's where design and product come in as well. So yeah, there's a lot that kind of goes into feature implementation. I think developing good algorithms by data science standards is much easier than integrating them seamlessly with the rest of the app. So we could follow best practices at our code level and it can be the result of hard work and many competent people. And then when we go to integrate them with the rest of our app, I mean, in the best case they're just awkward. At worst, they are very problematic. They cause zombie projects to happen. And that is something that I think we as a startup want to avoid.
Sunny Negless (00:26:50):
Wait, so tell me what a zombie project is.
Helena Belloff (00:26:54):
It just zombies on for all eternity. It’s just constantly sucking up resources. It’s expensive to the business, it’s a pain, everyone is wheel spinning and it just goes on and on and we just forget even why we wanted to do this in the first place. You start thinking, okay, who came up with this idea? And so that is something I think any new startup wants to avoid or just any worker in general. No one wants to be wheel spinning. And so I think a large part of what we’re doing collectively is rethinking how that entire process works and how can we avoid zombie-ing and get to that sort of at best, this might be awkward but it works, it’s scalable, it’s fitting in seamlessly with engineering, it’s fitting in seamlessly with product, it’s fitting in seamlessly with design and it is useful for the member.
(00:28:00):
And then that’s where that circular analytical pipeline comes back through, is the member actually interacting with us? And so yeah, that is the last prong I guess is we go from all the way to data collection to cleaning, storing, anonymizing, and then we go to analytics and algorithm development and engineering integration and finally to feature implementation and then all of that sort of cycle continues and enables us to make discoveries and generate insights and innovate in a way that we have never been able to before because we never had this data before.
Sunny Negless (00:28:49):
Oh my gosh. It's so incredible, it's so all-encompassing. It touches every single piece of our business, every piece of our product, and it's just under the radar. Your team of three is making all of this possible, making all of this happen. So I want to go back to that question from before, which was, there are now three members of the data science team at Levels. Do you have specialties, do you have different foci, or are you all kind of collectively working on the same project? I'm sure it's a little bit of both, but what does everyone do?
Helena Belloff (00:29:23):
Yeah, so I think Toru actually came from a more machine learning background. I remember her actually teaching us something, it was great. I was asking her all of these machine learning questions and she was just on it. And my background also originated in the machine learning, natural language processing space. I'm also coming straight out of research. My most recent role was very heavy on computational genomics and I did a lot of Alzheimer's research, which has a direct link with diabetes. That's how I ended up here. And I think Jason has more experience in data strategy and the scaling of data teams, and he's more heavily involved in defining our OKRs, because now we have a little team, we have a pod going on.
Sunny Negless (00:30:36):
It’s not just Helena running data science at Levels, it’s now a team of three.
Helena Belloff (00:30:40):
It is not just me solving all these complex puzzles by myself and we’re still kind of figuring it out. There’s so much to do. So Toru for example, has been hopping on the data warehouse stuff recently because there’s so much to do with that.
Sunny Negless (00:31:02):
And when you say data warehouse, just for folks who are less familiar, what does that mean?
Helena Belloff (00:31:06):
Yeah, so for our data warehouse we use Snowflake. Essentially what we do is, we have a Postgres database and that is where all of our data from the app gets stored. It gets stored in all these tables that our engineers built. And so what we used to do was I would query directly from Postgres alongside the engineers, and there's a lot of stuff in there; it's set up for engineering, not really for data science or data analysis. And there wasn't super, super clear documentation all the time because we have a very large code base and we have all these tables. Essentially what we do now is we take a copy of that Postgres database and we kick it up to AWS and we kick it back down into our warehouse and then we delete it out of AWS because we don't want to just leave data in AWS.
Sunny Negless (00:32:09):
What is AWS? It sounds like it’s kind of intermediary storage.
Helena Belloff (00:32:12):
So yeah, it’s Amazon.
Sunny Negless (00:32:14):
Yes, yes.
Helena Belloff (00:32:15):
So it's basically called an S3 bucket. We store it in there, essentially. So Galit created the script that will custom do all of this. We used to use a connector that sort of sat in the middle of Postgres and Snowflake, called Fivetran. And the data would go through the connector and then it would go into the warehouse. And that's actually how Help Scout data still gets imported, through Fivetran. And because we have such a large amount of glucose data, we store a glucose point, we could store one every five minutes for every user at once. So that's a very large table, and it became clear that that sort of way of transporting data was not sustainable in the long term. So Galit created this custom script that would take a bunch of CSVs, kick them into an AWS S3 bucket and then kick them down into Snowflake, delete them out of S3, and then we're left with the re-synced data. And so that will occur every two hours. Another little prong is, so what used to happen is whenever the warehouse would re-sync, so whenever it would re-sync new data that was coming in, the sync would take five minutes. Now it takes like 25 because we have so much more data. But during those five minutes the warehouse would kind of get disrupted because it was rebuilding all of these tables. And so another sort of prong is now we have a stage where essentially it will write data to that stage and then, once it's done, it will swap with the core. And so there's no downtime now, which is really cool.
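For readers curious what that flow might look like in code, here is a simplified sketch: upload each table's CSV export to S3, load it into a staging table in Snowflake, then swap the staging table with the live one so readers never hit a half-built table. The bucket, stage, credentials, and table names are placeholders, the Postgres export step is assumed to happen elsewhere, and this is not Levels' actual script.

```python
# Simplified Postgres -> S3 -> Snowflake sync sketch with a staging swap. Illustration only.
import boto3
import snowflake.connector

TABLES = ["glucose_readings", "food_logs"]  # hypothetical table names

s3 = boto3.client("s3")
conn = snowflake.connector.connect(user="...", password="...", account="...",
                                   warehouse="...", database="...", schema="...")
cur = conn.cursor()

for table in TABLES:
    key = f"sync/{table}.csv"
    # 1. CSV previously exported from Postgres (e.g. via a \copy step), uploaded to S3.
    s3.upload_file(f"/tmp/{table}.csv", "example-sync-bucket", key)

    # 2. Load into a staging table, assuming an external stage @sync_stage points at the bucket.
    cur.execute(f"CREATE OR REPLACE TABLE {table}_staging LIKE {table}")
    cur.execute(f"COPY INTO {table}_staging FROM @sync_stage/{key} "
                f"FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)")

    # 3. Swap staging with live so there is no downtime, then clean the file out of S3.
    cur.execute(f"ALTER TABLE {table} SWAP WITH {table}_staging")
    s3.delete_object(Bucket="example-sync-bucket", Key=key)
```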
(00:34:10):
So we have that copy. Only if you have an engineer role in the warehouse can you see all of the raw tables, we call them raw sources. And what we do with those raw sources is, I will come in and take some of the tables and I will clean them up and restructure them in a way that is useful for data analysis. And I will also document them heavily and properly redact them, and then I will expose them to all of the data users in the warehouse. So there are the raw sources, that's like our migration bit, and then the exposed tables that all the data users use. And if you're on the warehouse you'll see people have created dashboards that just monitor all of our analytics in real time, which is really cool. And that is what rebuilds.
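Here is a toy example of what "exposing" a raw source could look like: building a cleaned, redacted table that all warehouse users can query while the raw table stays restricted to the engineer role. The schema, column, and role names are illustrative only, not Levels' actual warehouse layout.

```python
# Toy raw-source -> exposed-table transformation with redaction. Illustration only.
import snowflake.connector

conn = snowflake.connector.connect(user="...", password="...", account="...")
cur = conn.cursor()

cur.execute("""
    CREATE OR REPLACE TABLE exposed.members AS
    SELECT
        id                              AS member_id,
        DATE_TRUNC('month', created_at) AS signup_month,  -- coarser than a raw timestamp
        SHA2(email)                     AS email_hash     -- redact direct identifiers
    FROM raw.members
""")
cur.execute("GRANT SELECT ON TABLE exposed.members TO ROLE data_user")
```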
(00:35:06):
So what Toru has been helping me do recently is, we have our new e-commerce system, and so she's been helping transform and clean and properly adapt those tables and expose them so that people can start getting insight from them, which has been a huge help because there are a lot of tables to clean up and transform and expose.
Sunny Negless (00:35:33):
So that actually… One of my questions was, what does a day in the life of Helena at Levels, of a data scientist at Levels, look like? And I also had a question of what would surprise your colleagues to know? And I think I kind of answered, or maybe you started to answer, my own question here. Regardless, I'm hearing there's a lot. Is there daily maintenance? Are there daily tasks you have to go in and do? So cleaning up and redacting. So with that, what does your day look like, Helena? What does the day in the life of a data scientist look like?
Helena Belloff (00:36:04):
Yeah, I mean, so it could look like just a day in the warehouse, which is always fun. It can be a little bit tedious, but I like puzzles, and the warehouse, I mean just even the flow I described, there's a lot of moving parts going on there and the system is very fragile. So if an engineer changes the schema of a table on our back end, because we have a custom script that will kick it to that S3 bucket and then kick it down, if they don't change that script to mimic the changes that they made on the back end, everything could break. It's very fragile, and we're working on putting in more safeguards to ensure that not everything will break, maybe just one table will have the problem, and also to automate the script changing thing, because it's very manual. But it's also an incredible feat.
(00:37:09):
It's amazing. So my day could look like debugging stuff, because it's very new, this sort of partnership and very close work between data science and engineering. And so there's a lot of kinks to work out and it's a very complex, fragile system. So it could look like working with Ian to get those e-commerce tables in as part of the migration, to tell that script, okay, take these new tables that we've added on the back end and add them to this script where we're copying it over, and sync it continuously. And then also, if there's a schema change with that, the table that Toru is exposing could break depending on what is changed. And so it's kind of following and squashing the errors that pop up. So that could be a large part of the day, or it could be a very simple, small part of the day.
(00:38:16):
It could be as simple as like, oh this column should be typecast as a character. Maybe a user ID or something is being saved as a number, as an actual digit and it shouldn’t be, maybe it needs to be a string. And so it could be just as simple as fixing that or it could involve working alongside product to improve a feature or brainstorm. I recently had a brainstorming session with David about, okay, if we want to integrate food insights into our Levels Levels or if we want to reward someone, for example, for eating chia seeds because we know chia seeds are good for you, they have fiber. How do we use our data to do that in this new Levels Levels feature that we are building?
(00:39:22):
And so it could be talking through, okay, what architecture do we need to have in place to do this? What are some analyses, like preliminary analyses, we can do to decide what is most important here? Is it most important that chia seeds are just good or bad? Or is it most important that they're a good source of fiber? Or what do we want to focus on and what do we want to pull out of our data? And then how are we going to do that? And then can we do a small analysis to prove, okay, this might be worth the engineering effort, because here is a very small system and a good use case for doing that. So it could look like that. It could look like, I don't know, going through data, doing an analysis to see what our most logged foods are. That's a question I get asked literally every day. Almost every day. People are like, I want to know what surprises you in our data. And I love doing that. It just always re-energizes me, going through and just seeing all the things. I recently discovered, through doing some feature implementation analysis, that one of the most common foods logged between the hours of midnight and 4:00 AM is pizza. And it made me so happy.
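The kind of ad-hoc question Helena mentions, such as the most-logged foods between midnight and 4:00 AM, could be answered with a query along these lines; the table and column names are hypothetical, not Levels' real schema.

```python
# Ad-hoc analysis sketch: most-logged food tags in the midnight-to-4-AM window.
import snowflake.connector

conn = snowflake.connector.connect(user="...", password="...", account="...")
cur = conn.cursor()

cur.execute("""
    SELECT tag, COUNT(*) AS n_logs
    FROM exposed.food_log_tags
    WHERE HOUR(logged_at) BETWEEN 0 AND 3    -- hours 0-3, i.e. midnight up to 4:00 AM
    GROUP BY tag
    ORDER BY n_logs DESC
    LIMIT 10
""")
for tag, n_logs in cur.fetchall():
    print(tag, n_logs)
```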
Sunny Negless (00:40:49):
But we don’t know if it was inferior Connecticut pizza or not, but I digress. It’s pizza.
Helena Belloff (00:40:54):
We don’t know that. But if you’re looking at my logs, you can get that information.
Sunny Negless (00:41:02):
Oh, that's so interesting. All right, so the day in the life, as you might imagine for basically an interdisciplinary person, can vary, so every single day is a little different?
Helena Belloff (00:41:12):
It can, yeah. I mean, I've also been working with Stadi a lot recently to set up our IRB research tables in the warehouse, because we kind of need to keep that data separate. We need to know, okay, if this person signed the consent form, they're part of the IRB, we can use their data as part of the IRB study. How do we identify that, okay, this is data we can use, without obviously identifying the person? We're very intentional about privacy and security. And so I've been working with him a lot to set up research tables in the warehouse that he can utilize as well. So it could look like that. I feel like I'm never bored. That's, I think, the moral of the story here.
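As a sketch of what separating out IRB research tables might look like, the example below keeps only members who appear in a consented-members table and carries a study ID instead of any direct identifier. Table and column names are placeholders, not Levels' actual schema.

```python
# Toy IRB research-table carve-out: consented members only, keyed by study ID. Illustration only.
import snowflake.connector

conn = snowflake.connector.connect(user="...", password="...", account="...")
cur = conn.cursor()

cur.execute("""
    CREATE OR REPLACE TABLE research.glucose AS
    SELECT
        c.study_id,          -- stands in for the member's identity
        g.reading_ts,
        g.glucose_value
    FROM exposed.glucose g
    JOIN research.consented_members c ON c.member_id = g.member_id
""")
```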
Sunny Negless (00:42:01):
Absolutely, for sure. There's so much, and I'm so glad that there are three of you now, we need all-
Helena Belloff (00:42:06):
I also keep kind of picking Chris's brain and just constantly sending him messages, being like, okay, what do you think about structuring a Help Scout table this way? What do you think about, what are the other variables that you want in here? Can we sort of customize and utilize Help Scout's API to mimic some of the automatic analytics that they do in the warehouse? So it could look like that as well.
Sunny Negless (00:42:37):
That's right. That's more of an internal thing that's affecting my team, the support team, because everyone is in pursuit of, basically, how do we help folks? How do we help folks improve their health? And every person on the team is focused on that mission. But seeing how, I guess I think about it as, I'm doing a SQL search because I'm looking at data in our logs to understand, on the customer service side of it, what steps did this person take to log in, or more of how are folks interacting with the product on the end user side. We're the first filter, so I'm thinking it's a SQL search, it's just a basic database search, but I don't have the piece before all that, and all those analytics before that. So yes, Chris does love a good Snowflake table and he's so good at them.
Helena Belloff (00:43:28):
He’s so good. Chris’s dashboards are-
Sunny Negless (00:43:32):
A work of art.
Helena Belloff (00:43:33):
They’re industry standard, the Levels, industry standard of dashboard.
Sunny Negless (00:43:39):
It's pretty fantastic. All right. So then I'd love to talk research, but the question that's on my mind at the moment is, what are you most excited about? I think about when I first met you, when I joined, or should I say when I joined in January and then when I met you in April, research hadn't launched yet. We were doing the IRB, we weren't doing the study quite yet. We were gearing up for it. And we talked a little bit about what you're excited about. Well, a startup, gosh, what's that been, 10 months? That's basically five years in startup time versus real world time. There's so much that has changed about the app, about the product, about the team. So what are you now most excited about?
Helena Belloff (00:44:19):
Oh, that's so hard. I mean, I feel like it hasn't changed. I'm most excited about personalized insights. I just think it's so powerful and so helpful for people, to automate that process and make it so that I don't have to go through it blindly. Let's say I'm not the most inquisitive user. I just want to be told, do this for a better outcome, or do this and you'll feel better, and here's all the reasons why. Here's evidence and here's more information if you want to delve further. If not, you'll learn something through this very small tidbit. And so I'm most excited about just automating that process of keeping people engaged with their own health, empowering them to take control of their own health, and then keeping them engaged and helping them reach their goals and understanding.
(00:45:27):
I mean, I knew nothing about nutrition at all before I joined Levels. And now I’m like, Oh, I should eat this before that, I should have a handful of almonds before I have all of these strawberries because I feel actually terrible when I don’t do that. And oh, I should have the veggies at the table first and then my main dish protein and then I’ll have the slice of bread at the table and then I’ll have dessert and maybe I ruin all of that progression, but I know these things now. I’m aware of it in a way that I was never aware before. I felt like I was asleep before just eating things and not even wondering why I felt bad.
Sunny Negless (00:46:15):
No, I absolutely adore that. And I don't know if you know this, but outside of Levels, I'm a nutritionist, so I worked with people for years. I worked with people for years on basically developing interoception, how to understand this 3:00 PM slump after, well, actually, back up. The 3:00 PM slump has become this common thing. I would say it's common. It's not normal. It's not normal for humans to feel that way, but it is so common.
(00:46:41):
I remember the mass exodus at my previous job, where between two and four, everyone was going to the coffee place to get a quad shot latte and some kind of bar or snack. So we all kind of have this inherent understanding like, oh yeah, that's just kind of the way it is. And most of our members are somewhere in that journey. We used to have a lot of folks who were really, really into biohacking, the folks that listen to everything and go really deep, and now we're moving into what we were excited to be moving into, kind of a more general population of folks who, I'd say, when I talk with my members on the onboarding calls, this is no one's first rodeo.
(00:47:21):
We may get there, we may get to the part where hopefully the next generation of folks are the friends and the relatives of folks who are maybe a little more interested in this. But now we're going to give it to folks where someone they know, they love, they care about has recommended it: this has helped me, it will help you. So we are growing and expanding who we're reaching. But as a nutritionist, going back to what I was originally saying, we would do weeks and weeks of working with someone to get them to go, maybe make a tweak to your afternoon. Is it what you're having in the few hours before the 3:00 PM slump? Is it your sleep? Is it your stress? So much iteration, and being effective takes a long time. It takes a long time to get there and to develop that. So something I was most excited about with being able to get this real time, personalized, objective data is it does help to speed up that process.
(00:48:16):
And I see that that's one of the magic pieces of Levels, getting to those personalized insights where we can, I'd say, help solidify some of our own kind of assumptions, and it's faster. We may even assume it's the caffeine. Maybe I had too much caffeine and now I'm crashing. So we can actually take a look at the data and go, well, actually, when we look at this, we're seeing these trends. Try this. And one other piece I want to add, I don't want to get on the soapbox here, but health information, and nutrition information in particular, there's so much out there, there's so much out there. It's often contradictory. And there's also, sometimes I hear from folks, from clients, a distrust of, "Right, but everyone says this, it doesn't work for me." And then you start questioning, well, then nothing's going to work for me.
(00:49:12):
So what I think excites me about Levels and the product is those more pinpointed insights, you mentioned being able to comb through your data and give that to you. It's never going to be, well, I don't think it's ever going to be, have this specific amount of macadamia nuts. But I love that, where folks have different proclivities, they have different desires as well. Some folks, like you, really enjoy sorting through data. I like puzzles. I like being really curious. Oh, it's so interesting, when I do this, the result is this. And other folks are like, I'm not. That doesn't excite me. I really want to just be told what will be most effective. I want to be very efficient with my time and my energy, my money, my investments. I'm putting a lot of time into logging meals and things like that.
(00:50:06):
Or I'm putting in what feels like a lot of time to me. I don't want to sit there and have to figure it out on my own. I want someone to just tell me. That's almost like having an assistant or having an EA, having someone who can help you make sense of that information so you can get back to the other important things you're doing for the day. So I want to hold space for both. One is not inferior or superior. We are all different humans. So having the ability for someone to dig in and get really curious, versus someone who is like, I want to optimize and I want to do this quickly. So I think that's very exciting, that we have space for both and that data science can help both individuals reach that.
Helena Belloff (00:50:41):
Yeah, I love that too. And that also just brings me back to when I first joined Levels. Steph and I got an Airbnb together in San Diego. We just decided, we were both nomading at the time and we were like, let's just do it. We both started on the same day. It could have gone horribly, but it went really well. And I remember we were both sort of getting into this at the same time, of, oh wow, I logged this thing and now I'm spiking, okay, what's in this thing? And we would check. So when we would do our weekly supermarket runs, we would just spend three hours, a ridiculous amount of time, in the supermarket checking nutrition labels and looking at the ingredients and what is in these things. And I discovered things. I knew, okay, yeah, I guess there's sugar in Heinz ketchup, but there's corn syrup in Heinz ketchup. I didn't know that.
(00:51:48):
And just checking even FiberOne bars, for example, they have so much sugar and people think, oh, they're a really good source of fiber, but they spiked me. And I'm like, they have a lot of sugar in them. And so that's sort of how my interest was piqued. But no one has time to just spend three hours, I mean, we had to at a certain point be like, we need to go home. But I think that's also what really excites me about this whole push towards food insights and personalized insights: if you want to know that information, I would love for us to be able to give that to you.
(00:52:32):
And I mean, my mind is just going to run wild. And not even saying this is even remotely in our pipeline right now, but barcode scanning in the app and it’ll like tell you all of the things you want to know about that ingredient or coming up with shopping lists that are balanced and you don’t have to spend all this time checking nutrition labels. And now we’re getting into recipes, which is so exciting and you can learn a lot that way.
(00:53:02):
And so I just love, there’s so many different avenues of engagement and like you said, there’s different Levels of engagement too. I think there’s just infinite possibilities here. And I love just being a part of keeping people engaged in whatever customizable personalized way works for them and helping them learn and get the most out of Levels because I think it’s an incredibly powerful product and we just have the most unique opportunity to help people.
Sunny Negless (00:53:37):
That’s the mic drop there.
Helena Belloff (00:53:39):
Yeah, that’s exciting. I mean that is why I joined Levels. Like I said, I’m never bored. This is the coolest data set I’ve ever worked with personally.
Sunny Negless (00:53:50):
Yeah, exciting stuff. I just want to go into the research piece of it, getting excited about what we’re trying to accomplish with the research. I think it’s 50,000 individuals is our goal, to enroll 50,000 individuals for tracking their glucose. Talk to me a little bit about how that’s impacted your work, what you’re doing, maybe how it’s different. I mean, we could talk about all kinds of things, but how is it different than the data set we have for folks not enrolled in the IRB or anything you want to talk about IRB and data science?
Helena Belloff (00:54:22):
Yeah. So the research side of it is actually really interesting to me because, like I said, I'm coming straight out of research, and I've said this on other podcasts. So I used to do Alzheimer's research, which is something I never thought I would do, actually. And I got super into it, and my thesis in grad school was looking at, so with Alzheimer's you'll have basically two distinct types of neuropathological lesions. One is what they call amyloid beta plaques, where basically this gene will start pumping out all of these amyloid beta proteins and it builds up and clumps together to form these plaques. The other one is these neurofibrillary tangles, which is the tau protein. It will again keep building up and it will actually get tangled and twist onto itself. And so that just disrupts neuronal function and can lead to things like memory loss, and it has a lot of cognitive impact.
(00:55:35):
And so I was trying to draw a connection between RNA sequence data and protein expression data. And we have no idea why all of a sudden these genes are like, let's just keep pumping out these proteins, and why this sort of malfunction occurs. And then also, in a portion of people who showed no cognitive impairment while they were alive, they find these plaques. So this disruption can start happening, I don't know, decades before you show any sort of Alzheimer's symptoms. And I just kept hitting a wall with my research. I was trying to use all of these polygenic risk scores to predict the odds of someone getting Alzheimer's later in life. And I was taking all of these different genome-wide association studies from different diseases, like diabetes, obesity, even depression, and trying to create a machine learning model that would predict it.
(00:56:42):
And I was getting horrible accuracy. I was getting 30% accuracy, which, that's not helpful. I think the highest I got to was 50%. And then I realized, actually, I had one conversation with Sam, this is what happened, where he was like, and I've said this before, Alzheimer's is type three diabetes. It's essentially insulin resistance in your brain. And I was like, wow. I actually knew that diabetes was directly linked to Alzheimer's, but I was just so focused on the genetics, and I was in computational genomics land, and I was just so, why can't I make this connection? Why can't I just look at all this complex upregulated, downregulated gene-protein pair analysis? I'm not seeing anything super crazy. I definitely saw some really cool stuff, but a huge part of that, and I think a huge part of why I was getting 50% accuracy, is because the other 50% is lifestyle and has to do with diet, even environment in the womb, where you live, how you live, how much sleep you get, stress is a huge one too.
(00:57:55):
And so, all of this stuff, Levels is the missing puzzle piece to my research. And so that's why research, especially at Levels, is very near and dear to me, because it is the other world that I think has yet to be integrated into mainstream, hardcore computational genomics research. And so I'm super excited by what we're going to discover in our IRB and just the different use cases it has and the conversations that we're going to spark and, yeah, I can't wait for that.
Sunny Negless (00:58:41):
Yeah, contributing. I see that you were in this previous area of medicine and medical research that was dealing with something really, truly horrible. Alzheimer's is just an absolutely horrible disease. We've recognized that, and it's taken some time for the mainstream to kind of catch up. I love being a part of a company that's furthering that information, that this is essentially type three diabetes. And so I could see you almost taking a step back, like a literal step back. There's a zoom out, and you're working on an important function of it, but you're going, there is something missing here. This is your master's thesis. There is something missing. And then to say, now I have the opportunity, almost like a ladder, I get to take a step back down the chain and I get to affect or potentially impact our knowledge of what happens downstream.
(00:59:36):
I just sat there and listened to your passion. You're definitely passionate about data science. I can definitely tell that. And it's exciting and it's energizing and it's interesting, listening to you talk about Alzheimer's research and understanding we really need to go one step further. We can prevent this, we can push this further out, hopefully prevent it, but we can push this further out. We can lessen the impact, we can lessen the number of people impacted, if we can take a step into the lifestyle side. So this research that we're doing, I see you just, again, light up, because you said it right at the start of the conversation. I think this is the, please correct me if I'm wrong, I think this is the first large scale look at CGM usage, and really it's showing the impact. It's showing the result of your lifestyle behaviors to the people who are being studied, to the participants, rather.
(01:00:29):
We're showing you in real time, you get the opportunity to make these adjustments. This is the first time we're looking at a large data set in non-diabetic individuals. So I say the first, thinking this is the first one and there will be more, because we're now going to use this entire idea, maybe not even this data set, but this idea of studying folks without metabolic disorder. And I could see longitudinal studies, long term studies, what's the impact here? So anyway, I see your passion.
Helena Belloff (01:00:59):
Yes. Oh my god. Yeah. Even in the case of Alzheimer's, there have been cases of people having cognitive impairment and none of these neuropathological lesions. And so, listening to some of our advisors, every time I hear Rob Lustig speak, I'm like, ear to the headphone. I'm listening. Yes, I agree with everything you're saying. And I think there's just been this huge gap in research for so long. And I always knew diabetes is linked to Alzheimer's, but I was so focused on, okay, can we cure it? What about therapeutics? What about diagnosis? What about all this stuff? But what about prevention? I think if we can shift and spark wide scale conversation, I mean, even for people who can't afford CGMs, can we help people and make our app useful for people just going to the supermarket and reading these labels, and what sort of long term impact does that have and how can we measure it? And I think we're going to be at the sort of center of that conversation. And that's so exciting.
Sunny Negless (01:02:24):
That's true. I think about data science and content, because a big part of what we do is content. We understand that these are not necessarily accessible for everyone. It is a high price point. We're trying to drive it down, but in the short term, we're driving the conversation. So not only, like I mentioned, are we driving forth the conversation and making it more mainstream and accepted that, yes, Alzheimer's and diabetes are connected. Yes, there are potentially preventative measures, we can do this 10, 15 years in [inaudible 01:02:51] than that for our longevity, and not just longevity but health span. So we are also using, even if you never use Levels, even if you never place a CGM on your arm or use the app, we're hoping to get this content out and continue this conversation in not just our bubble, not just our world and our field, but we want to impact a billion people.
(01:03:16):
That's a big way we're going to do that, is using that incredible data science. The information that we get from the research, the paper, the research is not going to go solely to the folks on the app. It's going to be for the entire community, the entire world to see. So you, as data scientists, again, I keep coming back to this, a team of three are leading this huge impact that's touching every single area. So content, research, to impact folks who will potentially never even hear of Levels. Though if we do our podcast well, everyone's going to know about Levels.
Helena Belloff (01:03:49):
Yeah, I think we are. It’s such a vast field. It’s so interdisciplinary. It’s so creative. There’s endless possibilities. And I love it. And I love this data and I love this mission.