Alternative data has exploded in the last few years. What was once used only by quant hedge funds is now a valuable asset to financial firms around the world. The question is no longer whether alternative data is useful; it's how fast it can be implemented. One new strategy to lower costs and speed up results is rapid prototyping, and UBS is one of the investment banks using it.
Norman Niemer is the Chief Data Scientist within UBS Asset Management's Quant & Data Science (QED) team, where he delivers data-driven investment insights. Niemer is also responsible for both the business and technical aspects of multiple innovation projects that augment human decision makers with machine intelligence. Outside of investment management, he builds open source libraries that accelerate data science and has led teams that built AI-based products that won several high-profile hackathons.
Last week, Niemer spoke as part of a Thinknum Alternative Data webinar series hosted by Thinknum’s chief growth officer, Marta Lopata. In the webinar, Niemer discussed how rapid prototyping increases data ROI faster than previous methods. Rapid prototyping, according to him, works in four steps: target users are identified, a dataset is chosen, a hypothesis is formed, and building begins. Once there's a prototype, data scientists share it with users while iterating on future versions.
Watch a replay of the webinar here.
Lopata: Is there a specific number of data sets you test against the hypothesis until you deem it a failed experiment due to a failure to reject the null hypothesis? What's your cutoff for p-value?
Niemer: That is a very statistical way of thinking about how to test the hypothesis, and that's typically not how I go through this. Skipping back a couple of slides, a lot of the success measures I put here are not necessarily something that you can test in a statistical setting, especially since the Thinknum data set has got five years' worth of history. But if I'm thinking, "Okay, how can I track openings around COVID?" then you might only have two true data points.
I'm not really comfortable running a statistical test on that. I mean, you definitely can, but I don't think it's going to tell you anything particularly insightful. So that's not a great answer to the question, but I typically don't run these kinds of p-value tests to deem whether an experiment was useful or not.
Obviously, if you work on something where we've got a larger test set, then you can do that. I mean, you've got the normal rules around hypothesis testing: you need at least 30 data points to make it a useful test. Maybe you can make it work with 10 or so on a t-test, or you need to do some type of Bayesian statistics. But again, I'm thinking about it less from a statistical perspective and a little bit more from a business perspective.
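As a back-of-the-envelope illustration of why formal tests break down at these sample sizes, here is a minimal one-sample t-statistic in plain Python; the KPI-surprise numbers are hypothetical and only there to show how sensitive the statistic is when n is small:

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return the t statistic for testing whether the sample mean equals mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (mean - mu0) / (sd / math.sqrt(n))

# Hypothetical quarterly KPI surprises: with only 8 points, a single
# observation can swing the t statistic, which is why a formal test
# rarely says anything conclusive at this sample size.
surprises = [0.8, -0.3, 1.1, 0.4, -0.6, 0.9, 0.2, 0.5]
t = one_sample_t(surprises, 0.0)
print(f"n={len(surprises)}, t={t:.2f}")
```

With 30-plus points this statistic starts to be meaningful; with two "true" data points, as in the COVID example above, it is not worth computing at all.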
"There's no model in the world that can fit a bunch of data points with only one actual data point. You just have to understand where the data comes from and how it behaves, and then you just have to run with it" - Niemer
What database software do you tend to use for your projects?
We don't have a big data type of setup. Maybe we should, but I think you can get a long way with just a pandas and SQL workflow. Everything gets stored in Parquet files that just sit somewhere, or otherwise it gets onboarded into a SQL database, and then people can locally work with whatever they need. So that's what we use. And there are corner cases if you work with something really big, where it might be PySpark and Dask and all those types of commonly known tools. But it's a lot of the open source stack that you would probably be familiar with.
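The stack described above is pandas plus Parquet files and a SQL database. As a dependency-free sketch of the "onboard a slice into SQL, then work with it locally" pattern, here is the same idea with Python's built-in sqlite3; the table name, tickers, and listing counts are hypothetical:

```python
import sqlite3

# A small slice of a hypothetical vendor dataset: (ticker, date, job listings).
rows = [
    ("DIS", "2020-03-01", 1250),
    ("DIS", "2020-04-01", 980),
    ("NFLX", "2020-03-01", 410),
    ("NFLX", "2020-04-01", 455),
]

conn = sqlite3.connect(":memory:")  # local scratch database for prototyping
conn.execute("CREATE TABLE job_listings (ticker TEXT, date TEXT, listings INTEGER)")
conn.executemany("INSERT INTO job_listings VALUES (?, ?, ?)", rows)

# Analysts can then pull whatever subset they need for a prototype.
for ticker, avg_listings in conn.execute(
    "SELECT ticker, AVG(listings) FROM job_listings GROUP BY ticker ORDER BY ticker"
):
    print(ticker, avg_listings)
```

In practice the slice would come from Parquet via pandas rather than hand-typed rows; the point is that a local database is enough to start iterating.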
What are you doing with alternative data outside of the US? We hear a lot these days about coverage outside, like in Europe and other places. What's your take on that?
Yeah, it is a challenge. I'm sure the person asking was thinking of the Eagle Alphas and Neudatas of the world. And obviously Thinknum has data sets that are global; I assume job listings are global. Sometimes it just works better in the US because there's more stuff being recorded, but it is a challenge. There are things out there, though, and the Thinknum guys and the Eagle Alphas and Neudatas have been spending a lot of time and resources to make that better, because a lot of people have these issues. But Europe's always a challenge with GDPR. Sometimes you just have to be creative; it's a constraint that you have to work around.
We run into that too, because we've got a very global investor base. But sometimes you can help them in different ways, not just with alternative data but with other things that you can enhance as part of the investment process, and still make them happy, even though unfortunately it's not as much data as you would like it to be.
How are you treating lower frequency data? What's your take on that?
I briefly mentioned that already with the hypothesis testing example, and I always joke that I have small data problems with big data. Because somebody in the process, be it Thinknum or maybe some other vendor, has already taken all that stuff that was really big at some point and curated it for you. And then you're trying to correlate it with low-frequency, quarterly or half-yearly company KPIs. That is a challenge; obviously, even with data sets that have been around for five years or so, you might still only have like 10 data points. I think it's not just, "Hey, can you predict some company KPI?" but, "Can you leverage the data set to give them insights that they would typically get in some different way?"
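A minimal sketch of that low-frequency problem, with made-up numbers: aggregate a higher-frequency series down to the KPI's quarterly frequency, correlate, and notice how few points survive the aggregation:

```python
import math
from collections import defaultdict

# Hypothetical monthly job-listing counts, tagged by quarter, and the
# quarterly revenue KPI (in $bn) we want to correlate them with.
monthly = [
    ("2020Q1", 100), ("2020Q1", 110), ("2020Q1", 120),
    ("2020Q2", 90),  ("2020Q2", 85),  ("2020Q2", 80),
    ("2020Q3", 95),  ("2020Q3", 105), ("2020Q3", 115),
    ("2020Q4", 130), ("2020Q4", 140), ("2020Q4", 150),
]
revenue = {"2020Q1": 5.1, "2020Q2": 4.6, "2020Q3": 4.9, "2020Q4": 5.8}

# Aggregate the higher-frequency series down to the KPI's frequency.
by_quarter = defaultdict(list)
for quarter, value in monthly:
    by_quarter[quarter].append(value)
quarterly = {q: sum(v) / len(v) for q, v in by_quarter.items()}

def pearson(xs, ys):
    """Plain Pearson correlation; easy for a prototype, fragile at small n."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

quarters = sorted(quarterly)
r = pearson([quarterly[q] for q in quarters], [revenue[q] for q in quarters])
print(f"{len(quarters)} quarterly points, r={r:.2f}")
```

Twelve monthly observations collapse to four quarterly points, which is exactly the "small data problems with big data" situation: the correlation is computable but far too few points to trust statistically.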
A lot of people do channel checks, for example. I think Thinknum has some review data sets; those might not necessarily correlate with KPIs, and neither do the surveys, but they give you some insight into how things are doing. Sometimes it's about, "Hey, I need granular insights into something that the company has provided." So it's not so much that you necessarily want to forecast the company KPIs; you might want to get really granular insights into what's happening at the company level that you can't get from the published accounts. You're just going to have to get creative. But yeah, it's a challenge.
We also thought about applying some Bayesian techniques, but there's only so much you can do with it. There's no model in the world that can fit a bunch of data points with only one actual data point, and you'll see that with COVID. You just have to understand where the data comes from and how it behaves, and then you just have to run with it.
"It's better to have a couple of use cases, and then that helps you a lot with how you onboard the dataset, how it should be stored in a database, or what the cleaning process needs to be" - Niemer
I have a few questions here around the onboarding part of the process. How do you organize your data science team to test those new data sets at speed and at scale?
We don't have an official onboarding team. I always like to get something from the vendor, because they know the data the best. You say, "Hey guys, I'd love to replicate the Disney case study." Maybe they can give you a slice of the data, so you have something clean to work with without having to go through that whole onboarding process. And then as you do more and more of these experiments, it becomes clear how you should best onboard the data set, as opposed to onboarding the data set first and then thinking about use cases. It's better to have a couple of use cases, and then that helps you a lot with how you onboard the dataset, how it should be stored in a database, what the cleaning process needs to be, and whatnot. That's typically done by us in a relatively decentralized way: the people that work with the data do that part of the process. And then over time, as we need to automate things, we'll maybe productionize that a little bit more, but that comes a lot later in the process.
The other thing I use is the material that the FISD Alternative Data Council put together: the vendor tear sheet and the due diligence questionnaires. And as we work on the data packaging standards, that should shorten the onboarding process further.
How do you treat different vendors when each has a different standard for mapping? If you're testing multiple data sets in this rapid prototyping process, that can cause some headaches, right?
For sure. Yeah. This is why I'm working with the FISD Alternative Data Council to really shorten that. Ticker mapping is a big issue, but with prototyping, I would just take whatever I can quickly map. I might only be able to map half the dataset real quick, or maybe it's only 10 companies, or maybe it's only one company. Before I worry about how to map the whole thing, let me just work with one company that I can map by hand. Maybe the name is Disney and I know what the ticker is, so bam, mapping done. Over time you do that more and more, but at least you have a good reason to do the mapping, which is a pain; at least you know you're going to be able to generate business value before you go through that painful mapping exercise.
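The "map one company by hand" shortcut can be as simple as a dictionary lookup that drops everything unmapped; the company names, tickers, and records below are hypothetical:

```python
# Hand-built mapping for the handful of companies the prototype needs.
hand_map = {
    "The Walt Disney Company": "DIS",
    "Netflix, Inc.": "NFLX",
}

records = [
    {"company": "The Walt Disney Company", "listings": 1250},
    {"company": "Some Unmapped GmbH", "listings": 77},
    {"company": "Netflix, Inc.", "listings": 410},
]

# Keep only the rows we can map; everything else waits until the
# idea has proven its business value.
mapped = [
    {**r, "ticker": hand_map[r["company"]]}
    for r in records
    if r["company"] in hand_map
]
print(mapped)
```

Once the prototype shows value, the dictionary gets replaced by a proper security-master mapping; until then, silently skipping unmapped names is the fastest path to a result.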
If you were to characterize a data set that's perfect for rapid prototyping in three points, what would those points be?
So here's my dream wishlist. The first is that it's very easily accessible and ingestible; it sits in an S3 bucket. I think Thinknum actually has an API, so that's pretty nice; you can load it directly into pandas. Or give me Parquet files. That's a good one. The second is that there's a lot of good documentation around it. That's another big problem, where you don't really know what you're looking at and you get a lot of back and forth with the vendor.
The third one is replicable case studies. I don't think any vendor does this, but that's part of the work that I think we as an industry should be heading towards. I feel like, "Hey, Thinknum, you guys have written this great Disney case study. Why don't you give me your Python code?" So instead of me having to reinvent the wheel, I'd love to just be like, "Okay, great. Here's a Python script. I can run it, and it recreates the graphs that you guys have on your website." So that's my Christmas wish list. And like I said, no vendor is at that point. So it's, "How close to that ideal point can you get?"
About the Data:
Thinknum tracks companies using the information they post online: job listings, social and web traffic, product sales, and app ratings. From this it creates data sets that measure factors like hiring, revenue, and foot traffic. Data sets may not be fully comprehensive (they only account for what is available on the web), but they can be used to gauge performance factors like staffing and sales.