Friday, September 7, 2018

Data Analysis Template

This is just a quick post to share my Jupyter notebook analysis template.  I analyze a lot of different datasets in a short period, so keeping the analyses consistent is very helpful.  I'll walk through the sections quickly to share a bit about my process.

Title Section

In the title section, I have a block for any ideas to explore, specific things I intend to do, any updates to the data I need to request, and any notes about the data.  These are all bulleted text boxes.

This section is VERY helpful when working on multiple datasets.  It's easy to forget what you were going to do or what you've done, and the summary up front helps you get back into place.

Preparation

Next is preparing the data.  No data comes ready for analysis.  Here I have blocks to read in the data, clean the resulting dataframe, and save it to an R data (Rda) object on disk.  The next time I need it, I just load the Rda and skip the cleaning.
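
The read, clean, save, reload pattern can be sketched roughly as follows.  The post's own workflow saves an Rda object from R; this sketch swaps in Python's standard library (`csv` and `pickle`) as a stand-in, and the names `clean`, `load_or_prepare`, and the file paths are hypothetical, not part of the original template.

```python
import csv
import pickle
from pathlib import Path


def clean(rows):
    """Hypothetical cleaning step: trim whitespace and drop empty rows."""
    cleaned = []
    for row in rows:
        stripped = {
            key: (value or "").strip()
            for key, value in row.items()
            if key is not None  # ignore overflow fields from ragged rows
        }
        if any(stripped.values()):
            cleaned.append(stripped)
    return cleaned


def load_or_prepare(raw_path, cache_path):
    """Return cleaned rows, using the on-disk cache when it exists."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Later runs: load the saved object and skip reading and cleaning.
        with cache_path.open("rb") as handle:
            return pickle.load(handle)

    # First run: read the raw file, clean it, then save the result.
    with open(raw_path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    cleaned = clean(rows)
    with cache_path.open("wb") as handle:
        pickle.dump(cleaned, handle)
    return cleaned
```

One consequence of this design: if the cleaning logic changes, the cached file has to be deleted by hand, otherwise the stale version keeps loading.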

Analysis

The analysis section is basically filled with mini experiments.  Each chunk is one.  As such, it's important that each has a bit of information in comments at the top of it:

  1. A description of the hypothesis being tested or explored.  Something like "looking at the distribution of the periodicity of events".
  2. Once it's done, describe the results.  Yes, the output should speak for itself, but you'll thank past you if you write down what you got from the analysis when you did it.  Something like "it looks like the periodicity is bimodal with one mode representing X and another representing Y."
  3. Add a comment with a UUID.  Seriously.  Every. Single. Block.  If it's something interesting you're going to put it in a document or a blog or something.  You want to be able to track it from beginning to end.  (Ours track from the report, through several drafts of the report, through drafts of the sections, to a figures rmarkdown file that generates all the figures, to an exploratory report where we created the original analysis.)  Seriously.  If you like it then you shoulda put a UUID on it.
  4. Now you can actually write the analysis code.
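
As a rough illustration of the chunk header described above, here is a minimal Python sketch (the post itself works in R, so treat this as a stand-in).  The helper names `chunk_header` and `find_uuid`, the comment layout, and the list of file suffixes are all assumptions made for the example.

```python
import uuid
from pathlib import Path


def chunk_header(hypothesis, results="TODO: fill in once the chunk has run"):
    """Build the comment header for one analysis chunk."""
    # Random UUID: generate it once, paste the header in, never regenerate.
    chunk_id = uuid.uuid4()
    return "\n".join([
        f"# Hypothesis: {hypothesis}",
        f"# Results: {results}",
        f"# UUID: {chunk_id}",
    ])


def find_uuid(root, chunk_id, suffixes=(".ipynb", ".Rmd", ".md", ".py")):
    """List files under root whose text mentions chunk_id."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            if str(chunk_id) in text:
                hits.append(path)
    return hits


print(chunk_header("looking at the distribution of the periodicity of events"))
```

Searching a project folder with something like `find_uuid` is one way to follow a figure from a report draft back to the exploratory chunk that produced it, provided the UUID comment was copied along at every step.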

Appendixes

This is where I put all of the extra stuff.

Testing

I always have a testing block.  Throughout the analysis, you'll spend a lot of time testing stuff to make it work (or simply looking up things like the dimensions of your data and the column names).  Putting those in a testing block keeps you from coming back later and wondering what that block in your analysis was there for.

Lookups

Sometimes you have big, ugly lookups.  Putting them at the top clogs the Preparation section, so I tend to put them at the bottom.  You'll remember you forgot to run them when your analysis fails.

Backup

Really a parking lot for anything you don't want in another section, but don't want to delete.


Ultimately, if I were doing full modeling, I'd probably want a template that follows the process outlined in ModernDive.  However, for someone just getting into analysis, hopefully this helps!
