About Me

Rohit is an investor, startup advisor and an Application Modernization Scale Specialist working at Google.

Tuesday, December 3, 2019

Interesting AWS re:Invent announcement threads for app-modernization

Amazon EventBridge schema registry stores event structure - or schema - in a shared central location and maps those schemas to code for Java, Python, and Typescript so it’s easy to use events as objects in your code.  https://aws.amazon.com/about-aws/whats-new/2019/12/introducing-amazon-eventbridge-schema-registry-now-in-preview/?trk=ls_card

AWS launches new program to drive migrations for end of support Windows Server applications https://aws.amazon.com/about-aws/whats-new/2019/12/aws-launches-program-drive-migration-windows-server/?trk=ls_card

Amazon Managed Apache Cassandra Service - eats DataStax's lunch  https://aws.amazon.com/blogs/aws/new-amazon-managed-apache-cassandra-service-mcs/

ML works across TensorFlow, PyTorch and MXNet:

  • SageMaker Studio - single pane of glass IDE for machine learning
  • SageMaker Notebooks - pairs notebooks with compute
  • SageMaker Experiments - tune, compare, visualize, collect & share models and experiments automatically
  • SageMaker Debugger - improve accuracy of models, feature prioritization, metrics for model training
  • SageMaker Model Monitor - detect concept drift
  • SageMaker AutoML with no loss of visibility or control - CSV (data) -> trains 50 different machine learning models, with a model leaderboard and a notebook with all the models & recipes

Amazon CodeGuru: automated code reviews + performance profiling, driven by machine learning - input handling, AWS best practices, latency & CPU utilization, visualize performance - will find the MOST EXPENSIVE line of code in terms of performance. Installed as an agent on the container, plus a web hook for pull requests.

Friday, November 1, 2019

Spring RestTemplate Buyer Beware!

TL;DR Be wary of the default RestTemplate injected or manually configured in your existing application. You should leverage HTTP connection pooling for the RestTemplate, which may not be turned on by default. You can explicitly configure it with the code samples in this post. The pool defaults are also undersized; change them to a number appropriate to your environment. I set them to a max of 20 per route; tune per load. Also configure stale connection reaping for the connection pool. For new code, use WebClient instead of the RestTemplate, as the Spring docs advise as of Spring Framework 5.0.

TL;DR based on the multiple enterprise engagements … 

  • The default HTTP client connection pools must be changed before deploy
  • Don’t forget to set ConnectionRequestTimeout (defaults to infinity)
  • If possible replace RestTemplate/HttpClient with WebClient, else migrate to OkHttpClient, which has resolved most observed issues
  • OkHttpClient has an excellent connection pool manager and solid connection failure and timeout handling mechanisms

For instance:
    // Won't configure the PoolingHttpClientConnectionManager
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }

    // WILL configure the PoolingHttpClientConnectionManager
    public RestTemplate restTemplate(RestTemplateBuilder builder) {
        return builder.build();
    }

For this to work you need to put the Apache HttpClient or the OkHttp library on the classpath.

Spring apps leverage org.springframework.web.client.RestTemplate as a synchronous client to perform HTTP requests. The default configuration of the RestTemplate doesn’t use a connection pool to send requests; it uses a SimpleClientHttpRequestFactory that wraps the standard JDK HttpURLConnection, opening and closing the connection. This is a problem. BasicHttpClientConnectionManager is not a fix either, since it can only be used for a low-level, single-threaded connection.

Under load, Spring RestTemplate client connections are capped at 4 per route. This *blows up under load*. If you see `HTTP status 500` for requests or slow responses, check the HTTPClient configuration and follow the recommendations below.

## Recommendations
- If you need connection pooling under the RestTemplate, use a different implementation of the ClientHttpRequestFactory that pools the connections: `new RestTemplate(new HttpComponentsClientHttpRequestFactory())`

- Use the `PoolingHttpClientConnectionManager` to get and manage a pool of multithreaded connections. The defaults of the pooling connection manager are too small. You should bump up MaxTotal, DefaultMaxPerRoute & MaxPerRoute to 20.

- Maximize the utilization of the HTTP connection pool
  - Implement a custom keep-alive strategy
  - Configure connection eviction to detect idle and expired connections and close them
  - Read this article on connection management: https://www.baeldung.com/httpclient-connection-management

HttpClientConnectionManager poolingConnManager
  = new PoolingHttpClientConnectionManager();
CloseableHttpClient client
  = HttpClients.custom().setConnectionManager(poolingConnManager)
      .build();

also see https://bitbucket.org/asimio/resttemplate-troubleshooting-svc-2/src/master/src/main/java/com/asimio/api/demo/main/ResttemplateTroubleshootingSvc2Application.java
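A minimal sketch that pulls the recommendations above into one Spring configuration class, assuming Apache HttpClient 4.5.x and Spring Framework 5 on the classpath. The class name, bean name and the timeout values are illustrative assumptions and not taken from any of the linked projects; the pool size of 20 follows the recommendation above. Tune every number to your own load.

```java
import java.util.concurrent.TimeUnit;

import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;

// Hypothetical configuration class; names and timeout values are assumptions.
@Configuration
class PooledRestTemplateConfig {

    @Bean
    RestTemplate pooledRestTemplate() {
        // Pool sized per the recommendation above (20 total, 20 per route).
        PoolingHttpClientConnectionManager pool = new PoolingHttpClientConnectionManager();
        pool.setMaxTotal(20);
        pool.setDefaultMaxPerRoute(20);

        // All three timeouts are in milliseconds. Without an explicit
        // connection request timeout, a caller can wait on the pool forever.
        RequestConfig requestConfig = RequestConfig.custom()
                .setConnectionRequestTimeout(2_000) // wait for a free pooled connection
                .setConnectTimeout(2_000)           // TCP connect
                .setSocketTimeout(5_000)            // wait between response packets
                .build();

        // Background eviction closes expired connections and connections
        // that have been idle for more than 30 seconds.
        CloseableHttpClient httpClient = HttpClients.custom()
                .setConnectionManager(pool)
                .setDefaultRequestConfig(requestConfig)
                .evictExpiredConnections()
                .evictIdleConnections(30, TimeUnit.SECONDS)
                .build();

        return new RestTemplate(new HttpComponentsClientHttpRequestFactory(httpClient));
    }
}
```

If you also rely on customizers contributed by other libraries, consider building the RestTemplate through the Spring Boot RestTemplateBuilder and handing it the same request factory instead of calling the constructor directly.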

 As of 5.0, the non-blocking, reactive org.springframework.web.reactive.function.client.WebClient offers a modern alternative to the RestTemplate with efficient support for both sync and async, as well as streaming scenarios. Always use the respective *Builder (RestTemplateBuilder or WebClient.Builder) to create one or more RestTemplate or WebClient instances. Dependencies like spring-cloud-sleuth use the corresponding customizer/builder to add additional features.

For greenfield apps pick WebClient over RestTemplate. As the Spring documentation puts it:

 **The RestTemplate will be deprecated in a future version and will not have major new features added going forward. See the WebClient section of the Spring Framework reference documentation for more details and example code**

## Miscellaneous

  1. For slow requests or for goRouter latency follow https://docs.pivotal.io/pivotalcf/2-5/adminguide/troubleshooting_slow_requests.html and Debugging the Cloud Foundry Routing Tier https://www.youtube.com/watch?v=U5GWgabsxXY
  2. If you encounter a customer that is experiencing an application performance issue (increased latency or decreased throughput or slow requests), try having them run this plugin against the app while it’s under load: https://github.com/cloudfoundry/cpu-entitlement-plugin.
  3. If your Application running on TAS is slow, performing poorly, experiencing high latency and/or decreased throughput then follow debug instructions here  https://community.pivotal.io/s/article/Application-running-on-TAS-is-slow-performing-poorly-experiencing-high-latency-and-or-decreased-throughput

Tuesday, October 29, 2019

Architecture & Services Review Template for 360 degree healthcheck of a Microservice

Do you want to review the health of your system of microservices? Need a checklist of things to look at as you evaluate the architecture and implementation? Take a look at this all-encompassing checklist for examining the production readiness and scale of your system of microservices.

  • Libraries
    • How many unused libraries are there?
    • Are there any libraries that could be replaced by features included with Spring?
  • Connection Pooling
    • How is concurrency handled ?
  • Latency
    • How long does the app take to start up?
    • Is there a meaningful difference in data transmission speed with a high load when using rsockets vs. https?
    • Is there a meaningful difference in data transmission speed when using a reactive tech stack vs. a traditional tech stack?
    • Are there any noticeable areas with inefficient HTTP calls?
    • What is the average response time for the app's network calls?
  • Memory/CPU
    • How much memory does the app use under a high load? Does it need JVM GC tuning?
    • How many threads does the app use under a high load?
    • What is the top constraint? (CPU, memory, disk, network)
  • Error/Exception Handling
    • How many exceptions does the app usually throw under a high load?
    • What is the mean time between failures?
    • How long does an outage usually last?
  • Code Complexity/Cleanliness
    • What is the highest level of cyclomatic complexity within the app?
    • How many unused classes are in the app?
    • How many unused methods are in the app?
    • Compliance with 15 Factors ?
    • High frequency of code change heat map
    • Sev 1 Production Incidents Review
  • Spring
    • Is there Classpath dependency bloat ?
    • Is an upgrade to Spring Boot 2.2 and its concomitant dependencies possible?
  • Resiliency
    • Are circuit breakers and HTTPClients configured correctly?
    • Are metrics from circuit breakers put in the firehose via Micrometer?
    • Failure Mode analysis.
  • Observability
    • Are applications logging at the right level?
    • Are applications emitting metrics at the right level?
    • Is spring-cloud-sleuth enabled for distributed traces?
    • Configure HTTP health checks for the app in Cloud Foundry
  • Performance
    • Is application startup time acceptable? Can it be reduced?
    • Is autoscaling behavior understood in the context of downstream dependencies?
    • Policy for autoscaling up and down
  • Higher level Architecture Review

Sunday, October 27, 2019

How do you get Threaddumps and Heapdumps for Java applications running in Cloud Foundry ??

You cannot!! You have hit a classical pain point due to the Java Buildpack using a JRE and not a full JDK.

So the issue is that you cannot cf ssh into the container in PCF and use the jcmd command to trigger a java threaddump. The classical way of resolving high CPU is to take three such threaddumps 30 seconds apart and check to see the threads that are stuck, ones that are not moving or contending on locks or deadlocks etc. You pair this with CPU Profiling information in the VM

Not being able to take a threaddump in PCF is frustrating. WAS/WebLogic had excellent support for getting these artifacts via must-gathers.

So what can you do ? 
You cannot rely on the /threaddump actuator endpoint because it does not provide nearly as much info as a classical threaddump.

Again this is a problem that anyone who wants to use the JDK tools in an app in PCF faces. Like for instance we want to run the javac command inside the app in PCF. We simply can't due to the above mentioned issue. 

OK So what can be done ... 
- A one-time custom Java buildpack is created, rebased on an open full JDK and not a JRE. This is not sustainable in the long term. You will need to restage the app with this custom full-JDK Java buildpack.
- The JDK tooling (jcmd, jmap and other command line tools) has to be trojan horsed into the app via a side-car container or something like pcfshell https://github.com/tfynes-pivotal/pcfshell, or the app has to carry the executables with it.
- Another option is that the app itself carries a /threaddump endpoint via Spring Boot Actuator, although if the app is dying due to OOM or high CPU this seldom works.
- If the app is crashing due to an OOM, it writes out a histogram and a cause of failure. In such a case (1) enable verbose GC logging to stdout so that you can collect and visualize the GC logs, and (2) configure a persistent volume bind for the Java buildpack to write the core file to a persistent volume (see oom-killer, jre-docs). The existence of a single bound Volume Service will result in terminal heap dumps being written.
- Use flame graphs in PCF to debug high CPU. This requires some investigation.
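For the option where the app carries its own dump capability, here is a minimal sketch that uses only the java.lang.management API, which ships with a JRE, so no jcmd or jstack binary is required. The class and method names are hypothetical. It captures full stack traces (unlike the 8-frame limit of ThreadInfo.toString()) and can take a series of dumps spaced apart, in the spirit of the "three dumps 30 seconds apart" routine described earlier. Like the actuator endpoint, it runs inside the struggling JVM, so it may not respond when the app is already dying.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: thread dumps taken from inside the JVM, no JDK tools needed. */
final class InProcessThreadDump {

    private InProcessThreadDump() { }

    /** Returns one dump with full stack traces and a deadlock count. */
    static String capture() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean monitors = threads.isObjectMonitorUsageSupported();
        boolean synchronizers = threads.isSynchronizerUsageSupported();

        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(monitors, synchronizers)) {
            out.append('"').append(info.getThreadName()).append("\" id=")
               .append(info.getThreadId()).append(' ').append(info.getThreadState());
            if (info.getLockName() != null) {
                out.append(" waiting on ").append(info.getLockName());
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }

        // Fall back to monitor-only detection when the JVM cannot
        // report ownable synchronizers.
        long[] deadlocked = synchronizers
                ? threads.findDeadlockedThreads()
                : threads.findMonitorDeadlockedThreads();
        out.append("deadlocked threads: ")
           .append(deadlocked == null ? 0 : deadlocked.length).append('\n');
        return out.toString();
    }

    /** Takes up to {@code count} dumps, sleeping {@code intervalMillis} between them. */
    static List<String> captureSeries(int count, long intervalMillis) {
        List<String> dumps = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            dumps.add(capture());
            if (i < count - 1) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the interrupt flag
                    break;                              // return what we have so far
                }
            }
        }
        return dumps;
    }
}
```

A call such as `InProcessThreadDump.captureSeries(3, 30_000L)` could be wired to an admin-only endpoint or a scheduled task and logged to stdout, where the platform log drain picks it up. Threads that show the same stack in all three dumps are the ones to look at first.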

Thursday, October 24, 2019

Migrate away from IBM Integration Bus

Monolith  ---------------------> DB

Step 1.
Monolith  ----------> DB |  ACL |  Microservice1     ------> new DB (Read only)
 - all data is migrated in read-through from old DB to new DB via ACL
 - Migrate 90% of the data like this
 - newDB <-- sync --> oldDB [needs Synchronization]


Migrate off of IIB
 - Rapid migration off of IIB
 - Take the custom code > wrap it in s-boot and don't change the data model
 - No local database .. all IIB converted apps talking to data model
 - [x] Conformist pattern ... then no ACL
 - [y] Evolve the API and add new consumers then add ACL and use your domain model. no database.

 (1) - Proxy off of IIB <READ>
 (2) - Selective re-examine data strategy based on the app <READ|WRITE>
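To make the read-through idea from Step 1 concrete, here is a minimal sketch of an anti-corruption layer (ACL) that serves reads from the new store and, on a miss, translates the legacy row into the microservice's own model and copies it across. Everything here is an illustrative assumption: the class names, the customer fields, and the in-memory maps that stand in for the old and new databases.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical read-through ACL between the monolith's DB and the new DB. */
final class ReadThroughAcl {

    /** Shape of a row as the monolith stores it (assumed for illustration). */
    static final class LegacyCustomerRow {
        final String custNo;
        final String fullNm;

        LegacyCustomerRow(String custNo, String fullNm) {
            this.custNo = custNo;
            this.fullNm = fullNm;
        }
    }

    /** Domain model owned by the microservice. */
    static final class Customer {
        final String id;
        final String name;

        Customer(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    private final Map<String, LegacyCustomerRow> oldDb;          // stand-in for the old DB
    private final Map<String, Customer> newDb = new HashMap<>(); // stand-in for the new DB

    ReadThroughAcl(Map<String, LegacyCustomerRow> oldDb) {
        this.oldDb = oldDb;
    }

    /** Reads from the new DB first; on a miss, migrates the record through the ACL. */
    Optional<Customer> find(String id) {
        Customer migrated = newDb.get(id);
        if (migrated != null) {
            return Optional.of(migrated);
        }
        LegacyCustomerRow row = oldDb.get(id);
        if (row == null) {
            return Optional.empty();
        }
        Customer translated = translate(row);
        newDb.put(id, translated); // the record now lives in the new DB
        return Optional.of(translated);
    }

    /** The translation step keeps legacy naming and padding out of the domain model. */
    static Customer translate(LegacyCustomerRow row) {
        return new Customer(row.custNo.trim(), row.fullNm.trim());
    }

    /** Number of records migrated so far. */
    int migratedCount() {
        return newDb.size();
    }
}
```

The sketch only covers reads. Writes still land in the old DB, which is why the diagram above calls out synchronization between newDB and oldDB.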



Eric Evans laid out the theory of domain driven design in his seminal book Domain Driven Design (DDD), published on August 30, 2003. DDD is such a dense tome that it takes an average senior software engineer two tries to read, after which you wonder how exactly you apply this to running software. Thereafter you start reading Vaughn Vernon's red book, Implementing DDD, to figure out the implementation of these patterns in code. At this point most software engineers are still struggling to apply the tactical and strategic patterns of domain driven design. This was the state of DDD around 2010.

Enter Adrian Cockcroft of Netflix fame, who along with Martin Fowler sparked the microservices revolution and a renewed interest in DDD as the theoretical underpinning for microservices. This jazzed up everyone, since no one had a clue about how to structure the boundaries and domain for individual microservices. What is the correct way to design the boundaries of your services? Bounded contexts, sub-domains and context maps from DDD came to the rescue and provided a basis to structure your system of microservices.

So we have the theory to now split and structure microservices. This again is NOT Enough. A lot of architects floundered in trying to figure out how to transform from a massive monolith to an event driven choreographed system of microservices. They struggled defining the subdomains and the bounded contexts.

Enter Alberto Brandolini, who figured out that Event Storming is the only way that merges the people and technical aspects, the tactical and strategic aspects, to visualize domains. Event Storming is a cross-functional facilitation technique for revealing the bounded contexts, microservices, vertical slices, trouble spots and starting points for a system or business process. Event Storming and other techniques that I mention later allowed architects and product owners to practice strategic DDD. There was no systematic way to apply strategic DDD to software before Event Storming. Alberto's influence on DDD is seminal, as the one who democratized it and made it available to the masses. There is also an analogy to Lego here. In the 1950s Lego bricks were introduced in Scandinavia, but their sales were struggling. Lego was far from the powerhouse it is today. It was only when the Lego company started shipping instructions to build the Lego sets that sales took off. This is key: the packaging and the instructions allowed a seven-year-old to build the set on their own and get a sense of accomplishment without nagging their parents.

Now, having practiced event storming for a couple of years, we realized that incremental notation is key. If you get stuck on creating a color-coordinated combination of aggregates, commands, events, read models, UI, data and policies, it can pigeonhole and restrict the domain model and lead to a tunnel effect. Domain events are front and center for event storming; everything else is secondary and needs to be added incrementally. The gap from event storming a system to an actual backlog of stories is HUGE. If you follow the textbook definitions of event storming you end up with an event sourced CQRS system, which most developers struggle to implement and maintain. A mistake I have lived with. There are multiple forms of event driven architecture and choreography (see what-is-event-driven). We want to start with the easier incarnations and then graduate to the top level of a full-blown event sourced system.

This led to the  SWIFT method that leverages a technique called Boris that uses graph theory to model the relationships between the capabilities in a system. This process generates information about how the system "wants to be designed" and attempts to avoid pitfalls such as premature solutioning. At the end of a Boris Exercise, Services, APIs, Data and Event Choreography and a backlog of work starts becoming obvious.

After Boris it is critical to run multiple modeling exercises and then determine MVPs for the vertical flows. In order to determine the right MVPs for your system you have to consider thin vertical end-to-end slices where these domains interact with one another. You have to prioritize the thin slices based on technical effort, risk and business value. Each slice encompasses a subset of the events. The MVPs map a path from strangling the monolith to leveraging tactical patterns to interact with the new domains and services. There are multiple techniques that can be applied here, including domain storytelling and user story impact mapping.

Vertical slices are identified by choosing short, domain event flows in the core domain and defining the architectural components required to produce those events. Slice by slice, translate the domain model into microservices that use APIs, message queues, etc., that will run on the platform. Finally, a set of user stories is defined and mapped to releases or MVP.

After performing the mapping of user stories that realize the tactical patterns to MVPs, we now have a concrete backlog that developers can start with and iterate on.

Here is the full sequence of steps that help with decomposing a monolith:
1. Define Objectives and Key Results (OKR) for the app modernization effort.
2. Event storm the application and identify bounded contexts.
3. Pick several short domain event flows in the core domain that constitute a vertical slice.
4. Create Boris diagrams that define the relationships between the domains for an end to end slice
5. Perform SNAP analysis to score the effort and define data, API and messaging interfaces
6. Create a backlog of prioritized user stories tied back to OKR.
7. Impact Map user stories to MVP or releases.

At this point the benefits and promises of DDD become real and the theory of DDD laid out by Eric Evans becomes a pragmatic living software system.

Happy Modeling.
Rohit Kelapure

Friday, September 20, 2019

How to tell Application Containers Running Java Apps to Trust Self-Signed Certs or a Private or Internal CA

Your 15th google search of "Cloud Foundry SSL Handshake exception, PKIX validation error, Hostname validation failed, Identity cannot be ascertained" has failed you. 
Talking two way SSL securely to other external services  when your app is running in PCF is always a headache in every engagement.

Typically this problem surfaces like this -  What is the best practice to add trusted certificate for an app when pushing it to PCF? It needs to talk to an internal service over HTTPS and we need to make sure that it trusts the certificate of that service ?

Another way this question can be posed is How to tell Application Containers Running Java Apps to Trust Self-Signed Certs or a Private or Internal CA ?

I outline all the solutions to this problem here, from the proper way to straight cheating. Strap in, it's gonna be a fun ride!

1. Embed the truststore certificate into the platform. This is the least intrusive way for the apps. The certificates get baked into the /etc/ssl/certs folder of the Diego container. If you are using Java buildpack version 3.12+ or 4+, then the Java buildpack will automatically load these trusted certs. Because this technique makes the certificate a trusted system certificate, it works for both Java and non-Java applications deployed on the foundation. This option requires PCF operator privileges to run "Apply Changes" on Ops Manager. See https://docs.pivotal.io/pivotalcf/2-4/devguide/deploy-apps/trusted-system-certificates.html

2. Suppose you are running a Docker container because one of your constraints is that your client needs credentials at a specific location such as /etc/opts, which is not writable in a buildpack-based container. When using Docker you don't get the Java Buildpack Certificate Client Mapper magic which automatically adds certs in /etc/ssl/certs to the JVM trust store. You have to manage the injection of certs into the JVM yourself. The best way to do this is in the Dockerfile. @Aniruth Parthasarathy got this working with a Dockerfile in a sample app that talks to AWS CloudHSM: https://github.com/aniruthmp/demohsm. The TL;DR here is that you have to import the certificates from a known location in the container using the keytool utility provided by the JVM. The Dockerfile that Ani wrote is pure gold in terms of creating the right cert environment for the JVM to run.

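3. You can specify the truststore and the keystore through the JAVA_OPTS environment variable. Note that the system property is `javax.net.ssl.trustStore` (property names are case sensitive) and that it expects a filesystem path inside the container rather than a `classpath:` or `file:` URL.
    JAVA_OPTS: '-Djavax.net.ssl.trustStore=/home/vcap/app/BOOT-INF/classes/kafka.client.truststore.jks'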

I don't like this technique since it now replaces the entire truststore. The truststore should ideally contain all certificates including the system ones. Use this one with caution.

4. Sometimes you have the luxury of specifying a truststore in Spring Boot or Spring Cloud Stream properties, for example if you are talking to Kafka over SSL. In these cases bundle your *.jks file in src/main/resources and specify the following properties to load the truststore correctly. The location needs to be the exact path in your Diego Garden container as unpacked by the Java buildpack.

      ssl.truststore.location: /home/vcap/app/BOOT-INF/classes/kafka.client.truststore.jks
      ssl.truststore.password: changeit
      ssl.endpoint.identification.algorithm: ''

5. Straight up cheating. Include the https://github.com/pivotal-cf/cloudfoundry-certificate-truster dependency in your Java Project 


Configure the TRUST_CERTS or the CF_TARGET environment variable. Certificates can be specified by either or both of these environment variables:

Pointing CF_TARGET at api.my-cf-domain.com will cause CloudFoundryCertificateTruster to download the certificate at api.my-cf-domain.com:443 and add it to the JVM’s truststore.

Setting TRUST_CERTS to api.foo.com,api.bar.com:8443 will cause CloudFoundryCertificateTruster to download the certificates at api.foo.com:443 and api.bar.com:8443 and add them to the JVM’s truststore. You can specify one or more comma separated hostnames, optionally with a port.

Be very careful with this technique as you will trust ALL certs from the specified endpoint. Remember the trust is only triggered when a call is made to the external api.foo.com endpoint.

If you cannot include the dependency in your project then just copy the src code into your project and edit it to include the domains. It is just two files.
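Regarding the concern raised under option 3 that a custom truststore replaces the system certificates entirely: one way around it is to merge the JVM's default trust anchors with your own certificates in code. The following is a minimal sketch using only JDK classes. The class and method names are hypothetical, and loading the extra KeyStore from your bundled .jks file is left out.

```java
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import java.util.Collections;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

/** Hypothetical helper: system trust anchors plus extra certs in one trust store. */
final class MergedTrustStore {

    private MergedTrustStore() { }

    /** Creates an empty in-memory key store of the JVM's default type. */
    static KeyStore emptyStore() {
        try {
            KeyStore store = KeyStore.getInstance(KeyStore.getDefaultType());
            store.load(null, null);
            return store;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Certificates the JVM trusts by default (cacerts or whatever the JVM was started with). */
    static X509Certificate[] systemIssuers() {
        try {
            TrustManagerFactory factory =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            factory.init((KeyStore) null); // null means "use the JVM default trust store"
            for (TrustManager manager : factory.getTrustManagers()) {
                if (manager instanceof X509TrustManager) {
                    return ((X509TrustManager) manager).getAcceptedIssuers();
                }
            }
            return new X509Certificate[0];
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Copies the system issuers and every certificate entry of {@code extra} into a new store. */
    static KeyStore merge(KeyStore extra) {
        try {
            KeyStore merged = emptyStore();
            int index = 0;
            for (X509Certificate issuer : systemIssuers()) {
                merged.setCertificateEntry("system-" + index++, issuer);
            }
            for (String alias : Collections.list(extra.aliases())) {
                if (extra.isCertificateEntry(alias)) {
                    merged.setCertificateEntry("extra-" + alias, extra.getCertificate(alias));
                }
            }
            return merged;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds an SSLContext that trusts exactly the certificates in {@code trust}. */
    static SSLContext sslContext(KeyStore trust) {
        try {
            TrustManagerFactory factory =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            factory.init(trust);
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, factory.getTrustManagers(), null);
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Number of entries in the store, with the checked exception wrapped. */
    static int entryCount(KeyStore store) {
        try {
            return store.size();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The resulting SSLContext can be handed to the specific HTTP or messaging client that needs the internal CA, which keeps the change scoped to that client instead of altering trust for the whole JVM.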

As a bonus, if you have the same issue with your app on Kubernetes, follow the excellent example created by @Robert Voorhees https://github.com/voor/certificate-example

Yeay!! freedom from googling SSL Handshake and PKIX Certificate validation path or other SSL host and dns verification exceptions!!!


Saturday, September 14, 2019

Driving Product Market Fit

The Startup School Podcast episode 1 from Y Combinator has gold nuggets on how to drive for product market fit and iterate to the right MVPs that deliver user value. I really liked the second part of the podcast, where Pebble founder Eric Migicovsky talks about asking users the right questions. I am pasting snippets below from the podcast that are extremely valuable for any startup founder.

At the core of a great user interview, you need to learn about their life, you need to talk about specifics around the problem area that you're trying to solve that the user may be going through. 

Second mistake that we pretty much all make is we talk about hypotheticals. We talk about what our product could be. We talk about features that we want to build.
I love to ask questions that extract numerical answers to three facts about the customer that I'm working with. 1. How much does this problem cost them today? I like to get a hard number, either in terms of how much revenue do they stand to earn if they solve this problem. 2. How frequently do they encounter this problem? Do they encounter it on an hourly basis, a daily basis, quarterly basis, yearly basis? The best problems that startups can target are ones that are encountered more frequently. 3. How large is their budget for solving this problem and who has this budget. You can imagine that say you're solving something for an industrial assembly line, a problem on the industrial assembly line. If you're talking to the operator, the person who's actually there on the kind of the assembly line, they may encounter this problem on a really regular basis, but they just don't have the budget, they don't have the authority to actually solve the problem. That's their boss or that's someone above them in the office or in the headquarters.  
Rahul Vohra from Superhuman describes a process where on a weekly basis he asks pretty much all his customers, but it doesn't even have to be your entire customer base, it could just be 30, 40 users, a critical question, how would you feel if you could no longer use Superhuman? Three answers, very disappointed, somewhat disappointed, not disappointed. He measured the percentage of users who answered the questions very disappointed. These are the users who most value your product. These are the users who your product has now become a key part of their life, it's kind of weaseled their way into their daily habits. If 40% or more of your user base reports that they would be very disappointed if your product went away on a weekly basis, that that's kind of the signal, that's the differentiation point that it says if you get past this point, your product will just grow exponentially. https://firstround.com/review/how-superhuman-built-an-engine-to-find-product-market-fit/
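The arithmetic behind that survey is simple enough to track in a few lines. This is a small sketch with a hypothetical class name; the 40% threshold comes from the quote above and the response counts in the usage note are made up.

```java
/** Hypothetical helper for the "how would you feel if you could no longer use it?" survey. */
final class ProductMarketFitSurvey {

    /** Threshold from the quote above: 40% or more answering "very disappointed". */
    static final double THRESHOLD_PERCENT = 40.0;

    private ProductMarketFitSurvey() { }

    /** Percentage of respondents who answered "very disappointed". */
    static double veryDisappointedPercent(int very, int somewhat, int notDisappointed) {
        if (very < 0 || somewhat < 0 || notDisappointed < 0) {
            throw new IllegalArgumentException("counts must not be negative");
        }
        int total = very + somewhat + notDisappointed;
        if (total == 0) {
            throw new IllegalArgumentException("no responses");
        }
        return 100.0 * very / total;
    }

    /** True when the share of "very disappointed" answers reaches the threshold. */
    static boolean passesThreshold(int very, int somewhat, int notDisappointed) {
        return veryDisappointedPercent(very, somewhat, notDisappointed) >= THRESHOLD_PERCENT;
    }
}
```

For example, with 18 "very disappointed", 15 "somewhat disappointed" and 7 "not disappointed" answers out of 40 users, the score is 45% and the threshold is met.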

Here is a full link to the transcript reposted in a more consumable format from the YC Blog. Enjoy!

#142 - Startup School Week 1 Recap: Kevin Hale and Eric Migicovsky

Craig Cannon
Hey, how's it going? This is Craig Cannon, and you're listening to Y Combinator's podcast. This week we're going to try something different. I'm going to recap the first week of Startup School. I've cut down the first week of lectures to be even shorter and combined them into one podcast. First, we'll have a lecture from Kevin Hale. Kevin is a YC partner and a cofounder of Wufoo. His lecture is about how to evaluate startup ideas. Then we'll have a lecture from Eric Migicovsky. Eric is a YC partner and the founder of Pebble.
Craig Cannon
His lecture is about how to talk to users. We put out a lot of content at YC and the Startup School lectures are some of the most valuable things we create, so check these out and let us know what you think.
Kevin Hale
This is how to evaluate startup ideas, and this is actually a new set of content that we've developed based on a lot of feedback that we saw from the last Startup School and what we noticed is a lot of people's challenges. Last year's curriculum actually had a lot of content that ended up being, when we looked at the data for who's participating in Startup School was like, "Oh, this is much more advanced, it's much further along." A lot of people, for instance, like I had no idea
Kevin Hale
or like I have too many ideas that they don't know which one to pursue as a main reason why a lot of people are only able to work on their startup sometimes part-time. Yes, they might be stuck without resources, but they didn't have conviction, they didn't know like, "Oh, what would I have to believe in order to say like I want to quit my job?" This is also a really great sort of skill to sort of have because if you are realizing you need to pivot, how do you evaluate if you need to do that,
Kevin Hale
and then also if you're pivoting to something else, like how do you evaluate whether something is worth going to? And if you already have a launched company then you might have problems with like, why isn't this growing or how do I improve it? And evaluating your startup, especially in the way that sort of investors evaluate startups ideas we find is going to be really, really useful. How can I predict if an investor will like my idea? That's ultimately what we're trying to figure out. And the answer's really easy.
Kevin Hale
For us at YC, the definition of a startup is a company that is designed or created to try to grow very quickly. If you're not trying to build a company that grows very, very fast, then you're just building a normal company, it's a small business, and there's nothing wrong with that, but these companies are the ones that investors are interested in. If you're hoping to build something that will have tons of users, that will have huge valuations, that will be able to attract venture funding,
Kevin Hale
then the evidence that we want is evidence that shows that your company can grow quickly. A startup idea is basically a hypothesis, and this is the way you should think about it. It's a hypothesis about why a company could grow quickly. Your job is to figure out how to construct your hypothesis, basically the pitch, to the investor so they understand how it can grow quickly. A lot of times people make the mistakes of trying to just accurately describe or over describe a lot of different parts. So I'm going to break this down.
Kevin Hale
Just like a normal hypothesis has a pretty decent structure for this, this will hopefully help you sort of workshop like understanding, "Oh, this is exactly all the reasons why this should succeed." Even before we start even building anything, we can have an understanding of like, "Oh, here's the potential path of this company," or "Here's the things I need to prove to show that this company could do well." The first is the problem. So startup ideas come as three parts, the first part is a problem, and it's basically
Kevin Hale
the initial conditions, you have to explain to me like what is the setting for this company that allows it to be able to grow quickly? The second is the solution. This is basically what is the experiment that you're basically running within those conditions for it to grow really quickly? The third is, what's your insight? So what's your explanation why the thing that you're going to try, your experiment is going to end up being successful? Those are the three components that I'm always trying to figure out
Kevin Hale
when I'm listening to someone's pitch. Here's a tip for talking about the problem or to know whether your problem, your initial conditions are correct. The first is, good problems, they're popular, so a lot of people have the problem. You want to avoid problems that there's a small number of people that have it. We like problems that are growing, so therefore the market basically like is it growing at a rate that more and more people are going to be having the problem and it's growing faster than other people's
Kevin Hale
or other types of problems? We like problems that are urgent, ones that need to be solved very, very quickly. We like problems that are really expensive to solve, because if you're able to sort of solve it, then you can charge a lot of money potentially. We like problems that are mandatory, right, so therefore it's like people have this problem and they have to solve it. We like problems that are frequent, ones that people are going to encounter over and over and over again and often in a frequent time interval.
Kevin Hale
What you want to have is like some aspect of the problem that you're working on, at least one of them, and it's ideal if you have multiple of them. You don't have to have all of them, but it's one of those things where it's like, if your company isn't growing or if someone's not as excited about the problem, it's probably missing some of these characteristics. Our ideal problems are millions of users, right, millions of people have it. That's why people like to work on consumer companies.
Kevin Hale
It's why some investors like to focus on them. We like markets that are growing 20% a year, the problem is growing quickly. We like problems where people are trying to solve it right now, immediately. We like problems that just cost a ton of money, so billions of dollars, right, or at least they all add up to some billion-dollar total addressable market. We like problems where the law has changed, the law has changed and regulation is put there and now people have to solve a bunch of problems. You saw a ton of healthcare startups
Kevin Hale
were born after Affordable Care Act was passed. A lot of that had to do with like there was now all of a sudden opportunity. This problem that all these hospitals and clinics had to solve. We like problems that people need to solve multiple times a day or will use it multiple times a day. Facebook's a good classic example, but people also really love Slack, right, because it's like, "Oh, I'm going to be engaged and using it multiple times a day during the workday." Solution, so there's pretty much only one piece
Kevin Hale
of advice I really have for the solution, that's the best advice that you can ever follow, and that is don't start here. What I mean by that is at YC we have an acronym for a problem that we try to avoid or basically an application that we have to go like, oh man, I wish they had started with the problem first, and we call it SISP, it means solution in search of a problem. Often what happens is like you're an engineer, you're excited about technology, some new technology has come on the scene. Let's say it's blockchain, right.
Kevin Hale
Let's say it's like React Native or whatever the new thing is, and you're like, I want to build something with this. It's a large reason you start working on a startup project. And then you go like, "Okay, what kind of problem can I solve now? I want to use this no matter what," and then you try to shoehorn a problem into the solution. What ends up happening is that's a much more difficult way to grow the company. It's not impossible for companies to grow this way but super inefficient. It's much better to be like, let me see
Kevin Hale
what problems people have and then I'll use whatever is necessary to solve them, and therefore it's much more likely that you will grow as a result. The last one's a little tricky, it's, what is the insight? What's the reason why this solution is going to work? And this is where a lot of companies start to get tripped up. Because it's really about like what is your company's unfair advantage, right? Why are you going to win versus everyone else? Why are you going to be the fastest one to sort of grow?
Kevin Hale
Because that insight is what's needed for the investor to choose you over anyone else, and it has to be related to growth. You have to have an unfair advantage that explains why you're going to grow quickly. If it's not related to that, then it's not going to be something that an investor is going to find valuable, right? Let's go through the types of unfair advantages that your company can have. There's five different types and companies do not have all of them. Really great ones, not surprisingly, will have all of them
Kevin Hale
and we'll go through two examples, but you want at least one, and it's nice if you can have two or three. But for most of you it's probably just one. So, the first one, so how do you know if you have a founder unfair advantage. And so all of these will be connected to numbers actually, which will help make this really easy. Is like are you one in 10 of all the people in the world who can solve this problem? Are you a super expert? And 99% of the people we find at YC do not fall into that category.
Kevin Hale
And so if you think it's like, well, I'm a product manager at Google, there's a lot of product managers at Google. If you say you're an engineer at Microsoft, there's a lot of engineers at Microsoft. It's like, great, but it's not one that will make me think, oh, you have a greater unfair advantage than someone else. If you've done a PhD, and let's say you've done it on some kind of crazy biotech research and you have like a special patent to be able to cure some kind of disease, then you have a founder advantage.
Kevin Hale
Your market, is it growing 20% a year? By default if you just build the solution in the space, you should just automatically grow, because you're just following a trend. If this is your only company advantage, then it's one of the weakest ones that you could have. It is great to be in that space, but you want to have something in addition to this. Like you're going to do like better than average because you've picked the right problem space and the right set of customers that want your problem,
Kevin Hale
but, again, if you're in a market that is stagnating or shrinking, then you're going to have investors worried about the long-term viability of your company as a result. Product, this is super simple, is your product 10X better than the competition? If it is, then you potentially have an unfair advantage. It has to be very, very clear, someone should be able to look at your product and go like, oh shit, this is so much better than everything else I've ever seen, it is 10X faster, it is 10X cheaper, et cetera.
Kevin Hale
If it's not an order of magnitude, let's say it's just like 2X or 3X, again, that's nice, but it's not enough for an investor to go like, "Oh, this is a slam dunk." Acquisition, so a lot of people think that if you go to an investor and you've done a bunch of Facebook or Twitter or Google ads and you show your CAC and LTV, that you will be able to prove that you have a sustainable sort of acquisition model. I want you to know that if paid acquisition is the only way that you are able to grow your company,
Kevin Hale
then I'm going to discount that channel of growth greatly. That is because if you actually get really popular, you actually start being someone significant, let's say becoming a $100 million revenue company, then you're going to attract a lot of competitors in the space. That advantage is going to quickly dwindle over time. Blue Apron is a really good example of this, almost all their acquisition was paid and then once they ate through that, there was almost nowhere else for them to sort of go.
Kevin Hale
You want to find acquisition paths that cost no money. My favorite companies, the ones that become really great are the ones that can grow by word of mouth, this is a good percentage of the way they grow. In the early days of your startup, if you don't have any money, that's actually a very great way of exercising how do I grow this without having to pay for it? In the beginning we tell you to do things that don't scale, but this is what you sort of want to accomplish is like, do I have an advantage that is free?
Kevin Hale
The last one is, do you have a monopoly? We don't mean this in the monocle Monopoly game sense, we mean it as, as your company grows, is it more difficult for you to be defeated by competitors? Do you get stronger? And so good examples of that are companies with network effects and marketplaces, where it tends to be winner-takes-all and one company will tend to win. Network effects is just basically as my network grows, the strength of my company and the value of the product or service
Kevin Hale
also grows with it. Not every company has it, but when you do have that, it works out great. There's something to keep in mind also, that's other things I'm looking to believe about a company, and that is something that trips up a lot of founders. There's two types of beliefs that I have about a company. And so there's the threshold belief, which is like, what's the default just for them to even succeed? So oftentimes for me it's like, "Oh, them building it, can they even build it?" That's a threshold belief.
Kevin Hale
If they can't even build it, none of it even matters. And so to me that question is not the most important. What will determine whether I'm going to win the lotto is a miracle belief that like, "Oh my God, if I believe that they can do this, they're actually going to be able to take off really well." And sometimes they're really simple. So if you are a heavy engineering team or doing a B2B or enterprise startup, again, the default is you have to build it. So if you can't even build it, then it's not even going to work.
Kevin Hale
I don't spend actually a lot of time looking at that. For me, I'm trying to figure out success will be determined by how well you can do sales, how well you can tell the story, how well you can actually convince customers and work through a sales process. I want evidence that shows that you know how to work through that and make that happen. All of my work with most of those companies is like not working on product, it's like, "Hey, all right, let's prove this other thing that if you have that, that'll be the thing
Kevin Hale
that actually will help people go like, 'Oh shit, they have the super combo.'"
Craig Cannon
All right, now for Eric's lecture on how to talk to users.
Eric Migicovsky
Hi everyone, my name's Eric Migicovsky, I'm a partner here at YC. I actually started a company that went through Y Combinator back in 2011. I started a company called Pebble. We made one of the first smart watches. I am really excited to be here to talk about talking to users, because this is one of the perennial things that you always hear about as one of the critical factors in starting a company. The best founders maintain a direct connection to their users throughout the lifespan of their entire company.
Eric Migicovsky
They maintain a direct connection because they need to extract information from their users at all different stages of running their company. Oftentimes people think that they're the CEO or they're the CTO, they're the technical kind of product leads of the company, they can outsource this research to other people in their company. They can hire salespeople, they can hire heads of product. But at the core, the best companies are the ones where the founders themselves maintain a direct connection to their users.
Eric Migicovsky
If you are the CEO, it is your job, it is in your job description to talk to customers. Take the time to learn how to do it well. All founders need to participate in this process as well. If you're the engineer, if you're the developer, don't think that you can escape this process just because you're the person who's coding. There's a pretty classic scene from the movie Office Space where there's an individual who says, "I'm the person who is the go-between between engineers and users, I know how to talk to people,
Eric Migicovsky
I have people skills." That is one of the things that you do not want to have happen at your company. You want to make sure that the founders and the core members of your company are the ones who develop the skills for talking to users so you do not have to hire someone like that to be the go-between. Talking to users is so critical that at the core of kind of YC's teachings there are only two things that you must do in order to start your company. You need to code or build your product and talk to users.
Eric Migicovsky
The Mom Test, as Rob actually explains, is three common errors that we make when we try to conduct user interviews. The first problem, the first mistake that we pretty much all make is we talk about our idea. We're founders, we love to pitch our idea, we love to talk about the product that we're working on, but during a user interview, that is not the time to be pitching the product. The goal of a great user interview is to extract information from the person that you're talking to, to extract data
Eric Migicovsky
that will help you improve the product or improve your marketing or improve your positioning. It is not to sell them on using your product. At the core of a great user interview, you need to learn about their life, you need to talk about specifics around the problem area that you're trying to solve that the user may be going through. Second mistake that we pretty much all make is we talk about hypotheticals. We talk about what our product could be. We talk about features that we want to build.
Eric Migicovsky
We ask questions like, "If we built this feature, would you be interested in using it or would you be interested in paying for it?" That is wrong. Instead, talk about specifics that have already occurred in the user's life. This will give you stronger and better information in which to make product and company-changing decisions. You also want to talk in general about the user's life. You don't want to just talk about the specific problem or, sorry, the specific solution that you're presenting.
Eric Migicovsky
Try to extract information about the users, the path that led them to encounter that problem. Ask them questions about their life in kind of more broader ways to extract context around how they arrived at this problem. Learn about their motivations. Learn about why they got themselves into that problem in the first place. The third trap that we pretty much all fall into is that we talk, we talk a lot. We're founders, we're always pitching investors, we're pitching employees, we're trying to hire people, we're trying to partner.
Eric Migicovsky
We tend to spend a lot of our time talking. In a user interview, try to restrain your interest in talking, really listen, take notes and listen to what the user's saying because in that span of time, the 10, 20, 30 minutes that you spend with the user, you're trying to extract as much information as possible so that when you return to the office and when you return to your cofounders you're bringing hard data, real facts about users' lives to the table. There are five great questions that everyone can ask during
Eric Migicovsky
their early customer interviews. The first question is, what is the hardest part about doing the thing that you're trying to solve? In general, the best startups are looking for problems that people face on a regular basis, or that they're painful enough to warrant solving. This question can help confirm for you whether the problem that you're actually, or the problem that you're working on is actually one that real users feel is a pain point, feel is something that they actively want to solve in their life.
Eric Migicovsky
The second question, to the point that I was making earlier about trying to get to specifics rather than hypotheticals is to ask the question, tell me about the last time that you encountered this problem. The goal of this question is actually to extract context around the circumstances in which the user encountered that problem. The third question is, why was this hard? The reason why you want to ask this question is because you'll hear many different things from different people. Going back to the Dropbox example,
Eric Migicovsky
you might encounter some people who say that maybe the problem, maybe the number one problem that they were encountering was when they emailed files back and forth, they ended up duplicating work because they didn't have the exact same kind of document at the exact same time. Maybe other people will say that they submitted the wrong document in the end to the professor for their group project because they had like crazy strings of file version numbers on the end. The benefit from asking this question
Eric Migicovsky
is not just to identify the exact problem that you may begin to solve with your solution to this problem, but you'll also begin to understand how you market your product, how you explain to new potential users the value or the benefits of your solution. In general, customers don't buy what. They don't buy the what, they buy the why. In the Dropbox example, they may not be excited and overjoyed at saying, "Oh, I now have this kind of file syncing tool that can keep all my files in sync," but by the why they'll say,
Eric Migicovsky
"Well, this product will help with this exact problem that I had just two weeks ago when I was trying to work on a student project with some of my friends." Answers that you get from customers to this question of why, why was this past problem that you encountered so hard, may actually inform your marketing or your sales copy as you build out the rest of your kind of product. Fourth question is, what, if anything, have you done to try to solve this problem? One of the biggest things that I've encountered
Eric Migicovsky
while helping YC companies over the last few years is that if customers, if potential customers are not already exploring potential solutions to their problem, it's possible that the problem that you're trying to solve is not a burning enough problem for customers, for them to be even interested in your better solution to this problem. This question tries to get at the root of that issue. Is the person who encounters this problem already trying to solve this? You want to ask this question for two reasons,
Eric Migicovsky
one is to figure out whether the problem that you're solving or you're working to solve is even really something that people are already looking for solutions to. And the second one is, what is the other competition out there? What will your product be compared against as you end up rolling out your solution and offering it to end customers? The fifth question is very tactical, it's what don't you love about the solutions that you've already tried? This is the beginning of your potential feature set. This is how you begin understanding
Eric Migicovsky
what the features are that you'll build out for your better solution to the problem. Now, note that this is not the question of, what features would you want out of a new file syncing product in the Dropbox example? Because that's a hypothetical question. Users in general are not great at identifying the next features that they want in the product. Just like the old Henry Ford quote, when we were developing the automobile, our users would have wanted a faster horse rather than a car. This question specifically targets
Eric Migicovsky
what are the problems with the existing solutions that they've already tried? These are specifics, and you can begin to kind of figure out what the differential between your new solution and the existing solutions already on the market will be. Talking to users, as I said before, is useful at pretty much all stages of your company, but there's three critical phases to an early stage company, I would kind of define that as a company that has not yet reached product-market fit in which talking to users would be extremely beneficial.
Eric Migicovsky
Those three stages are at the idea stage, before you've even begun developing any of your product, at the prototype stage where you have the first kind of rough beginnings of your product but you haven't really gotten in the hands of any paying customers or any users yet. The third one, which is after you've launched and you're iterating towards product-market fit, how do you guide that journey? I'll talk about a few tips for each phase. At the idea stage, you may have the back of a napkin idea,
Eric Migicovsky
you may have a thought, you may be commercializing some technology that you've been dreaming of but you don't yet have any first users. You need to begin finding the first people that will be interested in either providing information about the problem that they've encountered or potentially signing up to be first users. People come to me and ask, how can I talk and how can I find my first users? Honestly, some of the best companies are products or services that are built for the founders themselves, so start with yourself.
Eric Migicovsky
Begin, like test your user interview strategy on yourself. Try to walk through a situation where you've encountered that problem. The next step after that is to talk to friends, is to talk to coworkers to get warm introductions. It doesn't take a lot of people, you don't have to talk to thousands of people. Every good user kind of research strategy begins with just one or two people. The critical feature here is executing an unbiased and detailed customer or user interview strategy rather than just trying to pitch your idea to them.
Eric Migicovsky
Another cool hack that we've seen some great success with: a YC company in this batch is selling products to firefighters, and they realized that cold email introductions were just not working, it was not a way that they could get through to customers. What they did was they actually just dropped by fire stations in person, they didn't even email them to say that they were coming ahead of time, they just showed up and they said, "Hey, could we speak to the fire chief,
Eric Migicovsky
could we talk to someone about this problem that we've got a solution to?" You know what, it worked great. They managed to get dozens of in-person, 10 to 15 minute long meetings just by showing up. When in doubt, if there's a specific target customer base that you're looking to get feedback from, just try showing up. It feels a little bit weird because it feels like you're imposing on someone, but at the end of the day, the mindset that I like to get into is, if you truly think
Eric Migicovsky
that you're solving a problem that your target customer base is facing, you'll actually be doing them a favor, you'll be helping them out by taking their 15 minutes and learning more about the problem. Industry events are another great way to get a number of new customer interactions. I remember that when I was working on Pebble, we actually went to CES, which is this large consumer electronics show in Vegas. We didn't have a booth. We just went in guerrilla style, we just like randomly started setting up meetings
Eric Migicovsky
with potential users and we met them in like the coffee shop outside of the conference. We did that for zero dollars without any sort of marketing budget, just because that was where a lot of people in the industry were, and we knew that there was like a high concentration of potential people that we could talk to. Some tips for this stage, take notes, take detailed notes, because like I said before, you'll never know until later which key facts of these user interviews may be useful. If you're not great at taking notes
Eric Migicovsky
while you're talking to someone, bring a friend, bring a cofounder, ask the person if you could record it. When in doubt, capture as much information as possible. Keep it casual. Like I said before, you could just show up, you don't have to like pre-plan this. You don't have to have 20-minute blocks on your calendar scheduled for days on end of user interviews. Feel free to react, like honestly you'll learn so much through the first five or 10 user interviews that your process will dramatically improve
Eric Migicovsky
from those first interviews to the next batch. Don't feel like you have to do 100 user interviews all at the same time, just start with one, start with three, start with five, 'til you get the hang of it. The third thing is you need to be cognizant of the other person's time. Again, going back to what I said at the beginning, we love our idea, we're founders, we love talking about our idea. You need to keep yourself in check and make sure that you're cognizant of the other person's time. Honestly, you'll be able to get
Eric Migicovsky
probably the best information out of, say, a 10 to 15-minute long first interview, and that might be all the time you need just for that initial chat. As you move past the idea stage into testing your prototype with users, the next major kind of benefit that you can get from talking to users is figuring out who will be your best first customer. This is critical because it's possible that if you choose the wrong first customer that you may be led down a path that constrains you or artificially traps you without actually getting paid
Eric Migicovsky
by that first customer. We've created a framework that you can use to begin to identify before you begin working with them who the best first customers will be. During user interviews at this stage, I love to ask questions that extract numerical answers to three facts about the customer that I'm working with. The first one that I want to get to the bottom of is, how much does this problem cost them today? I like to get a hard number, either in terms of how much revenue do they stand to earn if they solve this problem
Eric Migicovsky
or how much expense do they currently spend trying to solve this problem, how much money is wasted today as they try to solve this problem? The second one that I like to get to the bottom of is how frequently do they encounter this problem? Do they encounter it on an hourly basis, a daily basis, quarterly basis, yearly basis? The best problems that startups can target are ones that are encountered more frequently. This is usually beneficial for two reasons. One is they encounter a problem on a more regular basis.
Eric Migicovsky
It means that the customer's feeling the pain of that problem on a more regular basis and they'll be much more receptive to a potential solution. The second reason why you want to tackle a problem that people encounter on a more frequent basis is you'll get more chances to know whether your product is actually solving a problem. In my case with Pebble, I loved the fact that I was working on a device that was kind of intended to be used every day. You wake up in the morning, you put your watch on.
Eric Migicovsky
That was great for me because I knew that if they weren't, if users weren't wearing their watch on a regular basis, that meant that I was doing something wrong. The best first customers are ones that have this problem very frequently. The third thing that you want to get to the bottom of is, how large is their budget for solving this problem? You can imagine that say you're solving something for an industrial assembly line, a problem on the industrial assembly line. If you're talking to the operator, the person who's actually there
Eric Migicovsky
on the assembly line, they may encounter this problem on a really regular basis, but they just don't have the budget, they don't have the authority to actually solve the problem. That's their boss or that's someone above them in the office or in the headquarters. Again, as you're trying to identify the best first customers, make sure that you're asking questions about whether they actually have the ability to solve the problem given the choice. The last stage before product-market fit
Eric Migicovsky
that can benefit from user interviews is actually the process of iterating towards product-market fit. Paul Graham's definition for product-market fit is when you've made something that people want. Marc Andreessen also has an amazing blog post about product-market fit where he describes it as when the product is just being pulled out of you, when you no longer have to push the product on customers, they're just pulling it from you. But the problem with these definitions of product-market fit is that they're vague.
Eric Migicovsky
They're also retroactive in that you have to already have product-market fit in order to know that you've reached it. They're not as useful for helping you figure out which features you need to build in order to iterate, in order to improve your product to get to product-market fit. You may have heard of the app Superhuman, which is a super fast email client. Well, the CEO published an amazing blog post a little while ago about how he was actually annoyed with this vague definition of what product-market fit is
Eric Migicovsky
and how it was a lagging indicator that didn't help him predict product-market fit, it only told him whether he'd achieved it or not. He wanted to create a real-time quantitative system that would help guide his company toward product-market fit. And, of course, it involved talking to users. He wrote a great blog post on this, you can just Google it. I'm just going to kind of touch on it, but I would highly recommend reading the entire thing because it is fantastic. But in it he describes a process
Eric Migicovsky
where on a weekly basis he asks pretty much all his customers, but it doesn't even have to be your entire customer base, it could just be 30, 40 users, a critical question, how would you feel if you could no longer use Superhuman? Three answers, very disappointed, somewhat disappointed, not disappointed. He measured the percentage of users who answered the question "very disappointed." These are the users who most value your product. These are the users who your product has now become a key part of their life,
Eric Migicovsky
it's kind of weaseled its way into their daily habits. He read some analysis that said that if 40% or more of your user base reports that they would be very disappointed if your product went away on a weekly basis, that that's kind of the signal, that's the differentiation point that says if you get past this point, your product will just grow exponentially. And he evaluated a number of other successful companies and realized that the answer to this question was always around or above 40%. So, again, I probably won't be able
Eric Migicovsky
to go into it too much more in detail, but I would recommend reading this blog post if you're at the stage where you're iterating and you actively have users that you can ask this question of. This can be an immensely useful thing for quantitatively determining whether the features that you worked on in the previous week were actually benefiting or adding to your product-market fit or potentially detracting from it as well. Some other great tips that we found at this stage is kind of a simple hack,
Eric Migicovsky
ask your users for their phone number during sign up. Because oftentimes you'll be looking at the data and you'll be wondering, why is the data showing this particular kind of learning about our customers? And you may be like thinking in aggregate like 20% of people have this problem. Sometimes it helps to just get on the phone and talk to one person who's encountering this problem. I always encourage founders to put contact information, including phone number, which is like a direct connection
Eric Migicovsky
to customers, pretty high up in the user signup flow. Second one is don't design by committee. You can't simply ask your users what features they want. You have to begin to understand whether those features are truly going to help make your product more sticky and more useful. You can do this through kind of the advice that the Superhuman CEO lays out in his blog post, or you could ask other tactical questions like instead of asking, will users be interested in using this new product or this new feature,
Eric Migicovsky
instead say, here's an upgrade flow, if you want this new feature, put in your credit card information or pay more, even before you've actually built out the feature. This could help give you information about whether the feature that you're working on is actually something that the users are going to use. The third thing to do during user interviews at this stage is to remember to discard bad data. Some of the kind of worst bad data that you may encounter is compliments.
Eric Migicovsky
People may say, "Oh, I love the new design," or, "Man, this thing is really useful." You may love that during the course of your user interviews, but it's actually not useful information because it's not specific, it's more of a general statement about your product, and it's not tactical, it's not giving you correct information on what you can change or what you can improve about your product. The second main type of bad data that you may encounter is fluff, these are hypotheticals, these are generic statements.
Eric Migicovsky
Whenever you're in the middle of a user interview and you start getting on to this hypothetical, you know, oh, here's what the product may look like in the future, try to steer it back to the specifics. Again, you're conducting a user interview not to pitch your product but to learn about problems or issues that the user has faced in their past so that you can improve it in the future.
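The weekly survey Eric describes from the Superhuman blog post comes down to a small calculation: the share of respondents who answer "very disappointed," compared against the roughly 40% mark. Here is a minimal sketch in Python, assuming responses are collected as plain strings. The function names, the constant, and the sample numbers are hypothetical illustrations and are not taken from the blog post itself.

```python
from collections import Counter

# Threshold mentioned in the talk: around 40% "very disappointed".
PMF_THRESHOLD = 0.40

# The three answers offered in the survey.
ANSWERS = ("very disappointed", "somewhat disappointed", "not disappointed")


def very_disappointed_share(responses):
    """Return the fraction of valid responses that are 'very disappointed'."""
    counts = Counter(r.strip().lower() for r in responses)
    # Only count answers that match one of the three survey options.
    total = sum(counts[a] for a in ANSWERS)
    if total == 0:
        raise ValueError("no valid survey responses")
    return counts["very disappointed"] / total


def passes_threshold(responses, threshold=PMF_THRESHOLD):
    """True when the 'very disappointed' share meets or exceeds the threshold."""
    return very_disappointed_share(responses) >= threshold


if __name__ == "__main__":
    # Hypothetical week: 30 users surveyed.
    week = (
        ["very disappointed"] * 13
        + ["somewhat disappointed"] * 10
        + ["not disappointed"] * 7
    )
    print(f"share: {very_disappointed_share(week):.1%}")
    print("at or above threshold:", passes_threshold(week))
```

Running the same calculation each week, on even 30 or 40 users, gives a number you can track as features ship, rather than a one-time verdict.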
Craig Cannon
All right, thanks for listening. As always, you can find the transcript and the video at blog.ycombinator.com. If you have a second, it would be awesome to give us a rating and review wherever you find your podcasts.