Evaluating Kedro-Viz adoption #987

Closed
yetudada opened this issue Jul 27, 2022 · 4 comments

Comments

@yetudada
Contributor

Introduction

The premise of this research study is to understand why only about 20% of the Kedro user base uses Kedro-Viz, so that we can work out what it would take to increase that number to 50%.

The research methods will include interviews and surveys to identify and understand the following user groups:

  • Primary users of Kedro-Viz and the value they derive from it
  • Users that rarely use Kedro-Viz and the reasons this happens
  • Users that have never used Kedro-Viz

Note: Some of the qualitative data is being sourced in kedro-org/kedro#1653 from the Databricks workflow evaluation.

Supporting data

Screenshot 2022-07-27 at 13 53 12

Link to chart

Outcome

This work will inform the Kedro-Viz roadmap for 2022/2023, as we will only consider Kedro-Viz adoption relative to Kedro adoption.

@yetudada yetudada added this to Roadmap Jul 27, 2022
@yetudada yetudada moved this to Later - Discovery or Research in Roadmap Jul 27, 2022
@tynandebold
Member

tynandebold commented Aug 1, 2022

Outcome for this ticket

  1. Design the research, e.g. which questions we are going to ask our users.
  2. Find the users and list them here.

@tynandebold tynandebold moved this to Todo in Kedro-Viz Aug 1, 2022
@tynandebold tynandebold moved this from Todo to In Progress in Kedro-Viz Aug 9, 2022
@yetudada yetudada moved this from Discovery or Research - Later 🧪 to Discovery or Research - Now ⏳ in Roadmap Aug 22, 2022
@NeroOkwa
Contributor

NeroOkwa commented Sep 1, 2022

Hypothesis

Users

I believe there are 2 types of Kedro-Viz users: 

  1. Technical users - They are comfortable using the command-line interface (CLI) and only use Kedro-Viz when they need to share their work, not during development. For these users, Kedro-Viz is a nice-to-have.
  2. Non-technical users - They are less comfortable using the CLI and actively use Kedro-Viz to follow 'what's going on' and inspect the code. For these users, Kedro-Viz is a must-have.

Assumptions

Based on these 2 user groups, I propose the following hypotheses for low adoption of Kedro-Viz:

  • Technical users don't need to use Kedro-Viz as it is a nice-to-have
  • Most Kedro (and hence Kedro-Viz) users are Technical users
  • Users don't know about Kedro-Viz
  • Users don't know how to use Kedro-Viz
  • Users have tried Kedro-Viz and did not like it 
  • Users have a better alternative to Kedro-Viz
  • Technical users don't need Kedro-Viz for pipeline visualisation (due to extended functionality in the CLI), but need it for Experiment Tracking. The more users know about Experiment Tracking the more users would use Kedro-Viz

Based on these assumptions, Kedro users were polled on Slack and Discord.

Poll Data

Screenshot 2022-09-01 at 11 03 00

Screenshot 2022-09-01 at 11 03 31

The polls show that over 90% of respondents have used Kedro-Viz on their last three projects. 

The next step is to send a recruiting email to respondents, inviting them to participate in subsequent user interviews.

Below is a copy of this email.

Recruiting email

Introduction

We've shipped several releases of Kedro-Viz and we would like to understand why and how the feature is used.
Our goal is to improve user adoption.

How can you help?

We're conducting user interviews to understand user pain points.

We will ask a series of questions during the interview and will need help understanding: 

  • The reason why you chose to use (or not use) Kedro-Viz;
  • Your chosen workflow in the last project that you worked on;
  • Pain points, errors, or workarounds you ran into (screenshots or recordings are helpful if you can provide them);
  • Potential improvements for Kedro-Viz.

Why did you get this invite?

You are receiving this invite because you either responded to our user poll, or we observed questions or comments about Kedro-Viz on our support channel.

@NeroOkwa
Contributor

NeroOkwa commented Sep 1, 2022

Interview questions

For Option A responses - 'Yes I used it on any of my last three projects'

Introduction

  1. Who are you? What do you do at your company?
  2. Have you used Kedro-Viz in your last three projects?
  3. What version of Kedro and Kedro-Viz did you use?
  4. Can you describe the last project that you used Kedro-Viz on?
  5. When did you last use Kedro-Viz?
  6. What did you use Kedro-Viz for?

Workflow

  1. Can you describe what steps you took to use Kedro-Viz for this project?
  2. How would you describe the experience when using Kedro-Viz?
  3. What did you like about Kedro-Viz during that project?
  4. Did you run into any challenges or pain points while using Kedro-Viz for this project?
  5. How did you solve these problems?
  6. Did you read the documentation at any point and how useful was it?

Conclusion

  1. What should we do to improve Kedro-Viz?
  2. If we improve Kedro-Viz based on your recommendations, would you use it in the future?
  3. Is there anything else you would like to tell us?

For Option D responses - 'No, I prefer using other tools'

Introduction

  1. Who are you? What do you do at your company?
  2. Have you used Kedro-Viz in your last three projects?
  3. Why aren't you currently using Kedro-Viz? 
  4. What are you using instead of Kedro-Viz?

Workflow

  1. Can you describe your workflow using this other tool for this project?
  2. What are your biggest pain points, and how do the other tool(s) help you solve them?

Conclusion

  1. If we improve Kedro-Viz (feature parity) would you use it in the future?
  2. Is there anything else you would like to tell us about why you are not using Kedro-Viz?

@tynandebold tynandebold moved this from In Progress to Backlog in Kedro-Viz Sep 26, 2022
@tynandebold tynandebold moved this from Backlog to In Progress in Kedro-Viz Sep 26, 2022
@NeroOkwa NeroOkwa moved this from In Progress to In Review in Kedro-Viz Oct 24, 2022
@NeroOkwa NeroOkwa moved this from In Review to Done in Kedro-Viz Oct 24, 2022
@NeroOkwa
Contributor

NeroOkwa commented Nov 25, 2022

Kedro-Viz Adoption Synthesis

Goal and Methodology

The goal was to understand the low adoption of Kedro-Viz by existing Kedro users (#987).

The research used a qualitative (interview 🎤 - 8 participants) and quantitative (polls 🗳️) approach across the QuantumBlack and open-source user bases.

Hypotheses

  1. Technical users don't need to use Kedro-Viz as it is a nice-to-have - False
  2. Most Kedro (and hence Kedro-Viz) users are Technical users - True
  3. Users don't know about Kedro-Viz - False
  4. Users don't know how to use Kedro-Viz - False
  5. Users have tried Kedro-Viz and did not like it - Not tested
  6. Users have a better alternative to Kedro-Viz - False
  7. Technical users don't need Kedro-Viz for pipeline visualisation (due to extended functionality in the CLI), but need it for Experiment Tracking. The more users know about Experiment Tracking the more users would use Kedro-Viz - False

1 - Kedro-Viz Use Case

Summary: All 8 users chose to use Kedro-Viz (pipeline flowchart) for stakeholder communication and for onboarding new non-technical (non-IDE) members of their team. 1 user used Kedro-Viz for debugging.

  • “It's a very good way for you to visualize your pipeline in the way that, how the node interact with the datasets and it's a very good way learn how its being built”.
  • “CSTs use it as well to understand the pipeline itself because the idea is that this pipeline or this output of the product will directly go through CSTs. So they need to be able to understand what's going on so they can adapt it faster. We are also creating documentation for the pipelines using Kedro-Viz”
  • “The second thing is like onboarding of new project members. Like exploring kedro projects through Viz, especially since you can also dive into the code helps accelerate onboarding time dramatically”.
  • “Seeing how the data is flowing through step by step by step, allowed us to understand where we had things like unstable data dependencies, under-utilized data dependencies, and upstream/ downstream conflicts. And really helped us make it a robust and efficient process”.

2 - Pain points

Pain point 1 - Long load/auto-reload time (especially for bigger projects) - 2/8 users

  • “I would want to pull it up and it, and I don't know if this has been resolved since then, but it would take quite a substantial amount of time, to be able to pull up and the auto reload didn't always work as quickly or as responsibly as if you wanted to see something in real time... It felt slow, like we were waiting a little too much”.
  • “The startup times on bigger projects can be, sometimes when you're in a meeting and you want to bring it up quickly, it takes some time to load, but nothing critical”
    • “The subjective perception is is that for projects with multiple pipelines and multiple modular pipelines, it feels longer than for smaller, like for space light for example”.

Pain point 2 - Circular dependency (#839 and #1105) - 1 user

  • “Sometimes on the layer level we have, we run into this that the layer can't be activated because of circular dependencies. I think there is an open issue for that, but that has been (especially in like big projects) a pain points sometimes because the graph that you then end up with is really much harder to navigate as opposed to with the layers that, that we created in the catalogs. And yeah, just being able to debug which data set is causing the issue would already be incredibly helpful”.
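
A minimal sketch of the kind of debugging aid this user describes, and not how Kedro-Viz actually handles layers: it collapses dataset-to-dataset edges into layer-to-layer edges and, if the layers form a cycle, reports which dataset pairs are responsible. The inputs (`layer_of`, `edges`) and the function name `find_layer_cycle` are hypothetical and only for illustration.

```python
from collections import defaultdict


def find_layer_cycle(layer_of, edges):
    """Return (cycle, culprits), or (None, {}) if the layers are acyclic.

    layer_of: dict of dataset name -> layer name (hypothetical input shape)
    edges:    iterable of (upstream_dataset, downstream_dataset) pairs
    """
    # Collapse dataset edges into layer edges, remembering which
    # dataset pairs produced each layer edge.
    layer_edges = defaultdict(set)
    causes = defaultdict(list)
    for src, dst in edges:
        a, b = layer_of.get(src), layer_of.get(dst)
        if a is None or b is None or a == b:
            continue  # unlayered datasets and same-layer edges cannot cause a layer cycle
        layer_edges[a].add(b)
        causes[(a, b)].append((src, dst))

    # Depth-first search; a layer that is still "in progress" when we
    # reach it again closes a cycle.
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = defaultdict(int)
    stack = []

    def visit(layer):
        state[layer] = IN_PROGRESS
        stack.append(layer)
        for nxt in sorted(layer_edges[layer]):
            if state[nxt] == IN_PROGRESS:
                return stack[stack.index(nxt):] + [nxt]
            if state[nxt] == UNSEEN:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        state[layer] = DONE
        return None

    for layer in sorted(layer_edges):
        if state[layer] == UNSEEN:
            cycle = visit(layer)
            if cycle:
                pairs = list(zip(cycle, cycle[1:]))
                return cycle, {pair: causes[pair] for pair in pairs}
    return None, {}


# Made-up example: "lookup" sits in the raw layer but is built from features.
layer_of = {"raw_sales": "raw", "lookup": "raw",
            "clean_sales": "intermediate", "features": "feature"}
edges = [("raw_sales", "clean_sales"), ("clean_sales", "features"),
         ("features", "lookup")]
cycle, culprits = find_layer_cycle(layer_of, edges)
print(cycle)
print(culprits)
```

In this made-up example the dataset pair behind the `feature` to `raw` edge is the one a user would want surfaced, since it points back up the layer order.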

Pain point 3 - Port access issue - 1 user

  • “I think it requires you to like open a new port and then do Kedro. If you do it on the existing port, I'm not sure if this is something that Kedro-Viz team can handle or it's like something like related to the network and the IPS and stuff. I'm not very qualified to answer that. But what we had to do is basically we had to make sure that that data and our pipeline is on different port and the Kedro-Viz on a different port because it used to override basically”.
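
One possible workaround for this kind of port clash, sketched under assumptions: ask the operating system for an unused port and pass it to Kedro-Viz explicitly. The helper names are hypothetical, and the `--port` option is assumed to be available in your Kedro-Viz version (check `kedro viz --help`).

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a TCP port that is currently unused on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS choose a free port
        return sock.getsockname()[1]


def viz_command(port):
    """Build the command line; `--port` is assumed to exist in your Kedro-Viz version."""
    return ["kedro", "viz", "--port", str(port)]


# Print the command rather than running it, so the sketch works without Kedro installed.
print(" ".join(viz_command(find_free_port())))
```

The port is only guaranteed free at the moment it is probed, so another process could still claim it before Kedro-Viz starts.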

3 - Potential Improvements

Kedro-Viz/Sphinx documentation workflow - 1 team of users

  • “It would be nice to embed Kedro-viz in a sphinx documentation or something like that. So for example, in the current project, we are building sphinx documentation, having a nice hosting on GitLab documentation. It would be nice to have a page there where the user can go and just see and interact with the pipeline”.

Pandas Profiling - 1 team of users

  • “So around sort of profiling datasets and I think that it would be quite powerful if you could link profiling into the Kedro-Viz as well so I could actually see what's in those datasets”
  • "I think Kedro-Viz kind gives this very nice high level of what the pipeline looks like and then I think being able to then sort of going a into detail and being able to see what's actually in those data sets would be very powerful"

4 - Proposed Feature

Low-code/Miro-like features - 2 users

  • “It would be helpful from a user perspective who doesn't know nothing about this. If for example, a customer is seeing the pipeline, A client, a client product manager, or a client finance guy wants to see where I'm using the finance data sets from. So they don't need to go through the whole pipeline. They can simply scroll through the finance data sets, click on that”.
  • “I think that it might become a really good platform for low code development of data science pipelines and so on because I think that generating the catalogs entry directly from kedro or generating the parameters that are used in kedro directly from the flow chart. So simply clicking and saying okay I want to have an input here, is something that could become really useful.”

Icons/tags for dataset filtering (#480) - 2 users

  • “Providing additional tags, especially because we could then use that within our data life cycle management and Azure in order to kind of get rid of intermediate data sets in experiment runs that we don't need anymore”.
  • “Ideally then we could also filter on certain dataset tags in the kedro visualisation to just see where do we have a type of information coming from, and which pipelines are affected by that. So when we look at like the raw data layer and we want to just see data pipelines that use datasets that are tagged in a specific way. That could be incredibly useful”.
    • “So what I'm wondering here is, and that might be missing something if these tags are coming from the nodes and the data sets inherit them. I know that that works, but I'm more thinking about the other way around, I don't know if you can tag datasets and then the nodes inherit from that”.
    • “That would allow the data sourcing team to then tag in the catalog, and then the data scientists then understands better where certain datasets with these tags flow”.
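
To make the request concrete, here is a rough sketch of filtering in the "other direction" described above, where tags live on datasets and nodes are selected through them. It does not use Kedro or Kedro-Viz APIs; `node_io`, `dataset_tags` and `nodes_touching_tag` are hypothetical names and data shapes.

```python
def nodes_touching_tag(node_io, dataset_tags, tag):
    """Return the names of nodes that read or write a dataset carrying `tag`.

    node_io:      dict of node name -> (input dataset names, output dataset names)
    dataset_tags: dict of dataset name -> set of tags (e.g. set in the catalog)
    """
    tagged = {name for name, tags in dataset_tags.items() if tag in tags}
    return sorted(
        node
        for node, (inputs, outputs) in node_io.items()
        if tagged & (set(inputs) | set(outputs))
    )


# Made-up pipeline: only the first two nodes touch finance-tagged data.
node_io = {
    "clean_sales": (["raw_sales"], ["sales"]),
    "train": (["sales", "params"], ["model"]),
    "report": (["model"], ["summary"]),
}
dataset_tags = {"raw_sales": {"finance"}, "sales": {"finance", "pii"}, "model": set()}
print(nodes_touching_tag(node_io, dataset_tags, "finance"))  # ['clean_sales', 'train']
```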

Sorting modular pipelines in order/non-alphabetically - 1 user

  • “Not alphabetically sorting the modular pipelines and having it in order”. “Like it would be good to have the order of the pipeline seem the same order that you see them in the diagrams”.
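
As an illustration of the difference this user is pointing at, the sketch below orders pipelines by their dependencies instead of by name, using the standard library's `graphlib` (Python 3.9+). The `depends_on` mapping and the pipeline names are made up; this is not how Kedro-Viz stores or sorts modular pipelines.

```python
from graphlib import TopologicalSorter


def flow_order(depends_on):
    """Order pipelines so each one appears after the pipelines it depends on.

    depends_on: dict of pipeline name -> set of upstream pipeline names
    (a hypothetical structure chosen for this example).
    """
    return list(TopologicalSorter(depends_on).static_order())


pipelines = {
    "data_processing": set(),
    "feature_engineering": {"data_processing"},
    "data_science": {"feature_engineering"},
    "reporting": {"data_science"},
}
print(sorted(pipelines))
# ['data_processing', 'data_science', 'feature_engineering', 'reporting']
print(flow_order(pipelines))
# ['data_processing', 'feature_engineering', 'data_science', 'reporting']
```

For pipelines that do not depend on each other, several flow orders are valid, so a real implementation would still need a tie-breaking rule.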

Troubleshooting with dependencies - 1 user

  • “So there were times when Kedro-Viz, obviously it needs all of the dependencies to be working correctly in order to generate the, the visualisation. At times it felt like it would be valuable if those dependencies weren't accurate. If those pieces could be orphaned as almost like a troubleshooting where the pipeline might be going wrong. It's more nice to have”
    • Value to workflow - “To track down upstream and downstream the missing dependencies or where something might be broken in a pipeline to make troubleshooting easier using DAAGs instead of IDs or whatever to do that”.

Projects
Status: Done
Status: Shipped 🚀