Projects

What I learned from my side project in education technology: Formata

[Screenshots of what the student would see, taken from the deck I sent to teachers.]

Last winter, I built an MVP for an ed-tech product called Formata. Here’s what it was, why I built it, and what I learned from it.

Why Education

I had been (and still am) trying little side projects in different industries because I like learning about and understanding new things. At the time, I had done some work in productivity and fintech, and I knew I wanted to have an impact on education at some point in my life. It’s been so influential on me, and it’s a huge lever for getting us closer to what I call “opportunity equality” worldwide, so I decided to do a small project in education this time.

Principles of Educational Impact

I did a little thought experiment: I imagined myself as a middle school kid again and thought about what had influenced my education the most. “My teachers” was the answer. Students spend the majority of their weekdays in school, and it’s the teachers who interact with them and come to understand each and every child. I saw it firsthand on a farm on the other side of the world: far more than the facilities or the curriculum, it’s the teacher who inspires the student and really has an impact on him or her.

Next, I asked, “Ok, so if teachers have the most impact on a child’s education, what makes a good teacher? What does ‘good’ even mean? And how do you measure it?” I did some research and came across the Gates Foundation’s Measures of Effective Teaching project, backed by hundreds of millions of dollars and pursuing these exact questions. Awesome!

Some more research led me to the interesting and sometimes controversial world of teacher evaluation. Traditionally, teachers have been evaluated by two methods: student test scores (also known as “value added”) and observations by someone like the principal. The thinking is that student test scores, as the outcome of a teacher’s teaching, should correlate with his or her teaching ability. Some administrations also have a rubric for what they think makes a teacher good, so a few times a year the principal might sit in on a class for 15 or so minutes to observe and evaluate the teacher.

There are some fundamental issues with both methods, which I’ll mention briefly. It’s hard to believe that a principal observing each teacher for 15 minutes a few times a year bears any strong relationship to how good the teacher actually is; the Gates Foundation’s research shows that such observations are less reliable than test scores. However, the tests on which teachers are usually evaluated (typically state-wide standardized ones) only happen once a year, and when teachers know the results are tied to their employment, there’s a strong incentive to “teach to the test”.

Who interacts with teachers the most? Who would be best at evaluating them? The students themselves. Again, the Gates Foundation did a bunch of research on exactly what students should evaluate teachers on, in effect quantifying the aspects of a good teacher. They narrowed the most important characteristics down to what they called the “7 C’s”: caring, control, captivate, clarify, confer, consolidate, and challenge. Structured in the right way (e.g. low-stakes and anonymized, so students aren’t incentivized to fudge their answers), student perception questionnaires asking about these characteristics were fairly reliable at discerning high-performing teachers from the rest.

Building A Product

I noticed that in the Gates Foundation’s research, the student perception surveys were being administered with pen, paper, envelopes, stickers, etc. I felt like the surveys could be administered much more efficiently with technology; the results could also be tabulated and organized much better for teachers and administrators to learn from.

To further validate the idea, I went to a bunch of ed-tech meet-ups, talking to teachers and asking them what they thought of it. They all agreed that getting more feedback, more frequently, on their teaching would be helpful.

I thought this would be a pretty quick MVP to build; I could even do some of the analysis of the feedback for teachers manually at first. All a teacher would have to do was give me the email addresses of his or her students, and I could auto-generate emails and questionnaires, send them off, and aggregate the results.

[Visualizations of student feedback I could generate for teachers, so they could pinpoint what to work on.]

Moving On

After a month of reaching out to teachers, both those I already knew and those I didn’t, and sending them my slide deck about Formata and its benefits, I finally got a few who said they were willing to try it. They were extremely busy, though (all teachers are overworked), and had to get permission from their department heads, who in turn had to get permission from the principal, to use it. The effort fizzled out, so I re-evaluated how I was spending my own time and moved on.

What I Learned

I learned about a lot of different things, but overall, I think this project reinforced two principles for me:

  • Ask better questions when doing customer development, and solve a problem.
    • My idea never really solved an important problem for my target audience: teachers. I should’ve talked to more administrators, who may care more about teacher evaluation. Also, you’re bound to get positive but not very useful answers when you ask someone what they think about your idea; whether it solves a big enough problem for them to actually integrate your product into their life is a different story. Not solving an important enough problem for teachers, coupled with lots of bureaucracy and the fact that they’re overworked, was not a recipe for excited users.
  • Keep doing things, don’t worry about failure.
    • I got to learn about an important and fascinating area of education by doing this project. I also got to learn about the realities of the space. I learned more about the power of customer development: that through observation and/or asking better questions, you can get to true pain points that people will pay you to solve. I learned that some types of problems and tasks excite me more than others. This project was also a great way for me to practice first principles thinking.

Thanks for reading this journal of sorts.


Cancer clinical trials and the problem of low patient accrual

Inspired by this contest to come up with ideas for increasing the low rate of patient accrual in cancer clinical trials, I decided to look into the data. Bold, by the way, is one of my all-time favorite books; it was co-authored by Peter Diamandis, creator of the herox.com website, founder of the XPRIZE Foundation, and co-founder of Planetary Resources. Truly someone to look up to.

Anyway, the premise of the contest is that over 20% of cancer clinical trials don’t complete, so the time and effort spent on them is wasted. The most common reason for termination is that the trial can’t recruit enough patients. Just how common is the low-accrual reason, though? Are there obvious characteristics of clinical trials that can help us better predict which ones will complete successfully, and what does that suggest about building better clinical trial protocols? I saw this as an opportunity to explore an interesting topic while playing around with the trove of data at clinicaltrials.gov and various Python data analysis libraries: seaborn for graphing, scikit-learn for machine learning, and the trusty pandas for data wrangling.

Basic data characteristics

I pulled the trials for a handful of the cancers with the most clinical trials (completed, terminated, and in progress), got around 27,000 trials, and observed the following:

  • close to 60% of the studies are based in the US*
[Chart: distribution of trial locations by country]

*where a clinical trial is “based” can mean where the principal investigator (the researcher who’s running the clinical trial) is based. clinicaltrials.gov doesn’t give the country of the principal investigator’s institution, so as a proxy, I used the country with the largest number of hospitals at which the study could recruit patients.

  • almost 25% of all US based trials ever (finished and in progress) are still recruiting patients

[Chart: distribution of overall trial status]

  • of those trials that are finished and have results, close to 20% terminated early, and 80% completed successfully (which matches the numbers the contest cited)

[Chart: completed vs. terminated trials]

  • almost 50% of all US based trials are in Phase II, almost 25% are in Phase I

[Chart: distribution of trials by phase]

  • and interestingly, the termination rate does not differ very significantly across studies in different phases

[Chart: trial status by phase]
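For reference, the charts above came out of fairly standard pandas and seaborn work. Here’s a minimal sketch, with a tiny made-up DataFrame standing in for the ~27,000 pulled trials; the column names are my assumptions about how the fields were stored:

```python
# A minimal sketch of the plotting behind the distribution charts above.
# The toy rows and column names ("country", "overall_status", "phase") are
# illustrative assumptions, not the actual clinicaltrials.gov field names.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

trials = pd.DataFrame({
    "country": ["United States", "United States", "France", "United States", "China"],
    "overall_status": ["Recruiting", "Completed", "Terminated", "Completed", "Recruiting"],
    "phase": ["Phase 2", "Phase 1", "Phase 2", "Phase 3", "Phase 2"],
})

# Share of trials by country (the location distribution chart).
country_share = trials["country"].value_counts(normalize=True).reset_index()
country_share.columns = ["country", "share"]
sns.barplot(data=country_share, x="country", y="share")
plt.ylabel("share of trials")
plt.show()

# Proportion of each status within each phase (the status-by-phase chart).
status_by_phase = pd.crosstab(trials["phase"], trials["overall_status"], normalize="index")
status_by_phase.plot(kind="bar", stacked=True)
plt.ylabel("proportion of trials")
plt.show()
```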

Termination reasons

Next, I was interested in finding out just how common insufficient patient accrual was as a trial termination reason versus other reasons. This was a little tricky, as clinicaltrials.gov gives principal investigators a free-form text field to enter their termination reason, so “insufficient patient accrual” could be described as “Study closed by PI due to lower than expected accrual” or “The study was stopped due to lack of enrollment”. I used k-means clustering (after term frequency-inverse document frequency feature extraction) of the termination reasons to find groups of reasons that meant similar things, and then manually de-duplicated the groups (e.g. combining the “lack of enrollment” and “low accrual” groups into one because they meant the same thing).
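Concretely, the clustering step looked something like the sketch below. The example strings and the cluster count are illustrative assumptions (in practice it ran over thousands of reasons with more clusters), but the tf-idf plus k-means flow is the same:

```python
# A minimal sketch of the tf-idf + k-means clustering described above, assuming
# the free-form termination reasons have already been pulled into a list of strings.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

termination_reasons = [
    "Study closed by PI due to lower than expected accrual",
    "The study was stopped due to lack of enrollment",
    "Terminated because funding was withdrawn",
    "Study halted due to loss of funding",
    # ... in practice, thousands more free-text reasons from clinicaltrials.gov
]

# Turn each reason into a tf-idf weighted bag-of-words vector.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(termination_reasons)

# Group reasons that use similar language; the number of clusters is something to tune.
kmeans = KMeans(n_clusters=2, random_state=0)
labels = kmeans.fit_predict(X)

# Print the top terms in each cluster so clusters that mean the same thing
# (e.g. "lack of enrollment" and "low accrual") can be merged by hand afterwards.
terms = vectorizer.get_feature_names_out()
for i, center in enumerate(kmeans.cluster_centers_):
    top_terms = [terms[j] for j in center.argsort()[::-1][:5]]
    print(i, top_terms)
```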

I found that about 52% of terminated clinical trials end because of insufficient patient accrual. This implies that about 10% of clinical trials that end (either successfully, or because they’re terminated early) do so because they can’t recruit enough patients for the study.

[Chart: breakdown of termination reasons]

Predicting clinical trial termination?

Clinicaltrials.gov provides a bunch of information on each clinical trial (trial description, recruitment locations, eligibility criteria, phase, and sponsor type: industry, institutional, or other, to name a few), which begs the question: can this information be used to predict whether a trial will terminate early, specifically because of low patient accrual? Are there visible aspects of a clinical trial that are related to a higher or lower probability that it fails to recruit enough patients? One might think that the complexity of a trial’s eligibility criteria and the number of hospitals from which it can recruit could be related to whether it accrues enough patients.

Here was my attempt to answer this question analytically: fitting a multiclass logistic regression classifier (predicting whether a trial would be “completed”, “terminated because of insufficient accrual”, or “terminated for other reasons”) on a random partition of the clinical trial data, and measuring its accuracy at classifying out-of-sample trials. The predictors were of two types: characteristics (e.g. phase, number of locations, sponsor type) and “textual” features extracted from text-based data like the study’s description and eligibility criteria. Some of these textual features came from a tf-idf vectorization process similar to the one described in the k-means section above; others were simply the character lengths of the text blocks. Below is a plot showing the relationship between two of these features, the length of the eligibility criteria text and the length of the study’s title, two metrics that perhaps get at the complexity of a clinical trial.

[Scatter plot: eligibility criteria length vs. study title length]
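The pipeline itself looked roughly like the sketch below. The toy rows, column names, and exact feature set are illustrative assumptions (the real input was the ~27,000 pulled trials), but the structure is the same: categorical and numeric characteristics plus tf-idf features from the text, feeding a multiclass logistic regression.

```python
# A rough sketch of the classifier described above. The toy rows and column
# names are illustrative assumptions, not the actual clinicaltrials.gov fields.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

trials = pd.DataFrame({
    "phase": ["Phase 1", "Phase 2", "Phase 2", "Phase 3", "Phase 1", "Phase 2"],
    "sponsor_type": ["Industry", "Other", "Industry", "Institutional", "Other", "Industry"],
    "num_locations": [3, 12, 1, 40, 2, 8],
    "description": [
        "A study of drug A in patients with advanced disease",
        "Randomized trial comparing drug B to standard of care",
        "Pilot study of a new imaging technique",
        "Large multicenter trial of combination therapy",
        "Dose escalation study of drug C",
        "Open label study of drug D after surgery",
    ],
    "eligibility_criteria": ["..."] * 6,   # long free-text blocks in the real data
    "title": ["..."] * 6,                  # study titles
    "outcome": ["completed", "completed", "terminated_low_accrual",
                "completed", "terminated_other", "terminated_low_accrual"],
})

# Simple "complexity" features: character lengths of the text blocks.
trials["criteria_len"] = trials["eligibility_criteria"].str.len()
trials["title_len"] = trials["title"].str.len()

features = ColumnTransformer([
    ("categorical", OneHotEncoder(handle_unknown="ignore"), ["phase", "sponsor_type"]),
    ("description_tfidf", TfidfVectorizer(), "description"),
    ("numeric", "passthrough", ["num_locations", "criteria_len", "title_len"]),
])

# Three classes: completed, terminated (low accrual), terminated (other reasons).
model = Pipeline([
    ("features", features),
    ("classifier", LogisticRegression(max_iter=1000)),
])

train, test = train_test_split(trials, test_size=0.33, random_state=0)
model.fit(train, train["outcome"])
print("out-of-sample accuracy:", model.score(test, test["outcome"]))
```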

The result: the model correctly predicted whether a trial would complete successfully, terminate because of low accrual, or terminate for other reasons only 83.6% of the time. That’s a pretty small improvement over predicting “this trial will complete successfully” for every trial you come across, which would be correct 80.6% of the time (see the Completed vs. Terminated pie chart above). Cancer clinical trials are very diverse, so it makes sense that there don’t seem to be any apparent one-size-fits-all solutions to improving patient accrual.

 


How our talented team won $2500 at the TechCrunch Disrupt NYC Hackathon

[Screenshot: CorpSquare]

We had an absolutely amazing and talented team at the TechCrunch Disrupt NYC 2014 Hackathon! Shout outs to our awesome front end designers Amanda Gobaud and Michelle Lee, and our tireless devs, Amine Tourki, Andrew Furman, and Teddy Ku. Here are the lessons that I learned from building a web application that won the $2500 Concur Technologies API first place prize.

  • Our app, CorpSquare (Concur + Foursquare), solved a problem. Several of the team members (me included) used Concur at the companies we worked for, so we had first-hand experience with problems and practical use cases that an app built around the Concur API could address. Even the Concur VP of Platform Marketing told us afterwards that he had seen many people with the problem we were trying to solve.
  • But we also played the game strategically. Concur is a business expense tracking platform, and most of its clients are big businesses. We felt that a business expense API wouldn’t seem as “exciting” or “sexy” as some of the other consumer-facing start-up APIs (Evernote and Weather Underground, to name a few). Since the companies sponsoring the hackathon offered API-specific prizes for the teams that used their APIs in the coolest way, there was likely to be less competition for the Concur API prize. We made a “value” bet of sorts, as value investors would say, and the strategy seems to have paid off.
  • Our team’s skills were complementary, but not too much so. A good hackathon team probably needs both design and dev skills, and different people should specialize in one or the other to make things most efficient. But everyone should be well versed enough in their non-specialty skills (designers in dev, devs in design) to be able to communicate efficiently. For example, our designers were comfortable with both UI/UX design and front-end development like CSS, and several of our developers were full-stack, implementing the back end but also helping out with the front end. We also used technologies (frameworks, languages) that we were all comfortable with, which, perhaps coincidentally, was another advantage.
  • Presentation matters, a lot. Our two wonderful front end designers spearheaded the movement to make our web application beautiful. With the help of everyone, beautiful it was. For the actual 60 second demo, we also selected the most energetic and enthusiastic speakers to present. First impressions matter, but when you’re being explicitly judged in comparison to at least 250 other people, and 60 seconds of talking and app visuals is all you’ve got, first impressions really matter.

Hindsight is 20/20, of course. Causally linking our tactics and strategies to our success is fuzzy at best. But learning never stops; whatever happens, success or failure, there is always something to take away and improve yourself, and others, with.


Spreed – the exciting journey so far, and lessons learned

[Screenshot: Spreed]

Spreed, the speed reading Chrome extension I developed last year to scratch my own itch, recently took off in popularity. People wrote about it in a few different places, and our installs in Chrome went up dramatically. The journey has just begun, but I’ve already learned some lessons that I wanted to share.

Lessons learned

  • Piggybacking on buzz can be an effective technique to increase awareness
    • We piggybacked (not deliberately) on the buzz created by the launch of Spritz, the speed reading startup. People wanted to learn more about speed reading, and came across our Chrome extension when they searched for it. We could have done better if we had optimized our web presence for the keyword “Spritz” after the launch, but my excitement at going from 2k installs to 20k installs in less than 5 days blinded me. Which leads me to my next lesson…
  • Be aware of emotions, instead of letting them take control
    • My excitement at our growth caused me to naively focus on vanity metrics like installs and visits, which blinded me to the SEO opportunity mentioned above.
    • Another example: I recently almost made a grossly sub-optimal decision regarding the outsourcing of development. Again, I let excitement and optimism tempt me to “forget” to use a disciplined decision making approach. The particular one I like to use is called the WRAP technique (pdf), which I learned from the fantastic book Decisive, by the Heath brothers.
  • To quote Steve Jobs: “A lot of times, people don’t know what they want until you show it to them”
    • We’ve not only developed the features that our users have said would be most helpful to them, we’ve also developed (and are developing) game-changing features that we anticipate users will find immensely helpful. We test our hypotheses by collecting feedback from users and running small tests and experiments. The lesson here, applicable to all of life and not just product development: be proactive instead of just reactive.

The most exciting part has been working with our users to make Spreed as helpful as it can be. Building things that help people, having those people reach out to thank you, and then having conversations with them to make the product even better has been extremely meaningful. Some excerpts from our most enthusiastic and dedicated users:

“Your chrome app is phenomenal. I have been using it for 4 days now, and still find it hard to believe that such a basic app can change one’s life so much.”

“Thank you so much, this has revolutionized my life.”

“I am a dyslexic and I have always had difficulty reading with full comprehension.  I can’t believe how this has changed this for me.  I can read at 350 words with great comprehension.  What happens for dyslexics is the words flow together sometimes forming new words that aren’t there.  With this app I see only the word!  It is going to be a life changer for me.”

There’s still a lot more to do, but I’m looking forward to the future. Learn by doing and building, strive to help others, and the journey will be an exciting one.

Shout out to Ryan Ma for the beautiful redesign of the Spreed Chrome extension!

 


Weekend hack: AngelList Alumni Bot

[Screenshot: AngelList Alumni Bot output]

Ok, it’s more of a scraper than a “bot”. But I built it because I was looking through NYC startups on AngelList and wanted to find founders who had graduated from my alma mater, the University of Pennsylvania. I didn’t want to click through the AngelList startup pages one by one and then click on every founder. There was no easy way to do what I wanted, and I also wanted to get to know the AngelList API a little better.

The AngelList Alumni Bot gets all of the startups for an input city (e.g. NYC), grabs each founder’s name, and checks AngelList or LinkedIn to see whether they graduated from an input school (e.g. the University of Pennsylvania).
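The core filtering logic is simple. Here’s a stripped-down sketch that operates on data as if it had already been fetched; the dict shapes and example entries are hypothetical, and the real script pulls founders and schools through the AngelList API wrapper and some hacky page scraping rather than from hard-coded lists:

```python
# A stripped-down sketch of the bot's filtering step. The startup/founder dict
# shapes below are hypothetical stand-ins for data fetched from AngelList/LinkedIn.
TARGET_SCHOOL = "University of Pennsylvania"

startups = [
    {"name": "Startup A", "founders": [{"name": "Alice", "schools": ["University of Pennsylvania"]}]},
    {"name": "Startup B", "founders": [{"name": "Bob", "schools": ["MIT"]}]},
]

def alumni_founders(startups, school):
    """Return (startup, founder) pairs where a founder attended the given school."""
    matches = []
    for startup in startups:
        for founder in startup["founders"]:
            if school in founder.get("schools", []):
                matches.append((startup["name"], founder["name"]))
    return matches

for startup_name, founder_name in alumni_founders(startups, TARGET_SCHOOL):
    print(f"{founder_name} ({startup_name}) went to {TARGET_SCHOOL}")
```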

There are a lot of areas for improvement: it’s not a web app, it’s really slow, it currently only supports two cities/locations (NYC and SV) and one school (UPenn), and it only grabs one founder for each start-up in a very hacky way by exploiting AngelList page meta tags. You can contribute to the source code at https://github.com/troyshu/angellistalumnibot.

Everything was done in Python. I used and extended this AngelList API Python wrapper; my extended version is at https://github.com/troyshu/AngelList.


My first webapp built on a framework: wtfconverter

[Screenshot: wtfconverter]

www.wtfconverter.appspot.com converts between common units of measurement (e.g. liters, seconds, etc) and silly units (e.g. butts, barns, etc.).

It was the first web application that I had developed using a web framework, in this case the webapp2 framework, on Google App Engine. This was two and a half years ago. Before that, I had developed everything from scratch, using PHP and MySQL for the backend.

This introduction to web frameworks intrigued me, and it’s what jump-started my journey into Ruby on Rails. Pushing local code to the Google App Engine production server and just having the site work blew my mind. Templating (the GAE tutorial taught how to use jinja2) felt like magic: creating and managing dynamic content became so much easier.
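For flavor, a handler in that style looks roughly like the sketch below. The route, template name, and conversion are made-up stand-ins rather than wtfconverter’s actual code, but the webapp2 and jinja2 pieces are the ones the GAE tutorial teaches:

```python
# A minimal webapp2 + jinja2 handler in the spirit of the GAE Python tutorial.
# The "/convert" route, template name, and barn conversion are illustrative,
# not wtfconverter's real implementation.
import os

import jinja2
import webapp2

JINJA_ENV = jinja2.Environment(
    loader=jinja2.FileSystemLoader(os.path.join(os.path.dirname(__file__), "templates")),
    autoescape=True,
)

# 1 barn = 1e-28 square meters, a real (and silly-sounding) unit from particle physics.
SQUARE_METERS_PER_BARN = 1e-28

class ConvertHandler(webapp2.RequestHandler):
    def get(self):
        square_meters = float(self.request.get("square_meters", "1"))
        template = JINJA_ENV.get_template("result.html")
        self.response.write(template.render({
            "square_meters": square_meters,
            "barns": square_meters / SQUARE_METERS_PER_BARN,
        }))

app = webapp2.WSGIApplication([("/convert", ConvertHandler)], debug=True)
```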

I started out by following the GAE Python tutorial word for word, which walked the reader through actually building a site. Then I developed my own little web app that was a bit more useful but not much more complicated than what I had learned in the tutorial. This is exactly how I learned Ruby on Rails, too: I walked through the Rails tutorial, building a microblogging app along with the author, and then built my own web app using what I’d learned: Pomos, a Pomodoro Technique timer. Pomos has since been deprecated, but here’s a screenshot:

[Screenshot: Pomos]

 

Anyway, I learned a lot from following these tutorials, where I actually developed something concrete, and then branching off to do my own thing. This is the heart of experiential learning, and it’s what Sal Khan, founder of Khan Academy, talks about in his book The One World Schoolhouse: when a student has ownership of his education by actually applying it, e.g. by building something, he is much more likely to enjoy learning new knowledge and skills. But reforming the current state of education is a topic for another post.
