Friday, September 7, 2018

Data Analysis Template

This is just a quick blog to share my jupyter notebook analysis template.  I analyze a lot of different datasets in a short period, so having the analysis consistent is very helpful.  I'll walk through the sections quickly to share a bit about my process.

Title Section

In the title section, I have a block for any ideas to explore, specific things I intend to do, anything I need to request to be updated in the data, and any notes about the data.  These are all bulleted text boxes.

This section is VERY helpful for working on multiple datasets.  it's easy to forget what you were going to do or what you've done and the summary up front helps get you back in place.

Preparation

next is preparing the data.  No data comes ready for analysis.  Here I have blocks to read in the data, clean the created dataframe, save it to an R data (Rda) object on disk, and then, the next time I need it, I just load the Rda and skip the cleaning.

Analysis

The analysis section is basically filled with mini experiments.  each chuck is one.  As such, it's important that each have a bit of information in comments at the top of it:

  1. A description of the hypothesis being tested or explored.  Something like "looking at the distribution of the periodicity of events".
  2. Once it's done, describe the results.  Yes, the results should describe the results but you'll thank past you if you write down what you got from the analysis when you did it.  Something like "it looks like the periodicity is bimodal with one mode representing X and another representing Y."
  3. Add a comment with a UUID.  Seriously.  Every. Single. Block.  If it's something interesting you're going to put it in a document or a blog or something.  You want to be able to track it from beginning to end.  (Ours track from the report, through several drafts of the report, through drafts of the sections, to a figures rmarkdown file that generates all the figures, to an exploratory report where we created the original analysis.)  Seriously.  If you like it then you shoulda put a UUID on it.
  4. Now you can actually write the analysis code

Appendixes

This is where I put all of the extra stuff.

Testing

I always have a testing block.  Throughout the analysis, you'll spend a lot time testing stuff to make it work, (or simply looking up things like the dimensions of your data and the column names).  Putting those in a testing block keeps you from coming back later and wondering what the block in your analysis was there for.

Lookups

Sometimes you have big, ugly, lookups.  putting them at the top clogs the Preparation section, so I tend to put them at the bottom.  You'll remember you forgot to run them when your analysis fails.

Backup

Really a parking lot for anything you don't want in another section, but don't want to delete.


Ultimately, if I were doing full modeling, I'd probably want a template that follows the process outlined in Modern Dive.  However, for someone just getting into analysis, hopefully this helps!

46 comments:

  1. An all around complex information structure is utilized when considering huge measure of information where as a basic information structure is viewed as enough if the client information is little. ExcelR Data Science Courses

    ReplyDelete
  2. This comment has been removed by the author.

    ReplyDelete
  3. Hey, thanks for this great article I really like this post and I love your blog and also Check data science course Data-Analytics course

    ReplyDelete
  4. wonderful article. Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article. This article resolved my all queries.
    Data science Interview Questions
    Data Science Course

    ReplyDelete
  5. Really nice and interesting post. I was looking for this kind of information and enjoyed reading this one. Keep posting. Thanks for sharing.

    data science course

    ReplyDelete
  6. wonderful article. Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article. This article resolved my all queries. keep it up.
    data analytics course in Bangalore

    ReplyDelete
  7. Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article. This article inspired me to read more. keep it up.
    Correlation vs Covariance
    Simple linear regression
    data science interview questions

    ReplyDelete
  8. Great post i must say and thanks for the information. Education is definitely a sticky subject. However, is still among the leading topics of our time. I appreciate your post and look forward to more.

    Simple Linear Regression

    Correlation vs Covariance

    ReplyDelete
  9. Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article. This article inspired me to read more. keep it up.
    Correlation vs Covariance
    Simple linear regression
    data science interview questions

    ReplyDelete
  10. It has fully emerged to crown Singapore's southern shores and undoubtedly placed her on the global map of residential landmarks. I still scored the more points than I ever have in a season for GS. I think you would be hard pressed to find somebody with the same consistency I have had over the years so I am happy with that.
    data analytics courses

    ReplyDelete
  11. Good Post! Thank you so much for sharing this pretty post, it was so good to read and useful to improve my knowledge as updated one, keep blogging.data science training in Hyderabad

    ReplyDelete
  12. Amazing Article ! I would like to thank you for the efforts you had made for writing this awesome article. This article inspired me to read more. keep it up.
    Correlation vs Covariance
    Simple Linear Regression
    data science interview questions
    KNN Algorithm
    Logistic Regression explained

    ReplyDelete
  13. Very nice blogs!!! i have to learning for lot of information for this sites…Sharing for wonderful information.Thanks for sharing this valuable information to our vision. You have posted a trust worthy blog keep sharing, data sciecne course in hyderabad

    ReplyDelete
  14. very well explained .I would like to thank you for the efforts you had made for writing this awesome article. This article inspired me to read more. keep it up.
    Simple Linear Regression
    Correlation vs covariance
    data science interview questions
    KNN Algorithm
    Logistic Regression explained

    ReplyDelete
  15. Highly appreciable regarding the uniqueness of the content. This perhaps makes the readers feels excited to get stick to the subject. Certainly, the learners would thank the blogger to come up with the innovative content which keeps the readers to be up to date to stand by the competition. Once again nice blog keep it up and keep sharing the content as always.
    Data Science Training

    ReplyDelete
  16. Fantastic blog extremely good well enjoyed with the incredible informative content which surely activates the learners to gain the enough knowledge. Which in turn makes the readers to explore themselves and involve deeply in to the subject. Wish you to dispatch the similar content successively in future as well.

    Data Science training in Raipur

    ReplyDelete
  17. Thanks for posting the best information and the blog is very helpful.data science interview questions and answers

    ReplyDelete
  18. The content is well acknowledged, so no one could allege that it is just one person's opinion yet it covers and justifies all the applicable points. I have read such a startling work after a long time!
    Data Science Training in Hyderabad
    Data Science Course in Hyderabad

    ReplyDelete
  19. Really wonderful blog completely enjoyed reading and learning to gain the vast knowledge. Eventually, this blog helps in developing certain skills which in turn helpful in implementing those skills. Thanking the blogger for delivering such a beautiful content and keep posting the contents in upcoming days.

    data science certification in bangalore

    ReplyDelete
  20. Hey, great blog, but I don’t understand how to add your site in my rss reader. Can you Help me please?
    data scientist training and placement in hyderabad

    ReplyDelete
  21. I really enjoyed this blog. It's an informative topic. It helps me very much to solve some problems. Its opportunities are so fantastic and the working style so speedy.
    data scientist training and placement in hyderabad

    ReplyDelete
  22. Thanks for posting the best information and the blog is very important.artificial intelligence course in hyderabad

    ReplyDelete
  23. Thanks for posting the best information and the blog is very good.data science institutes in hyderabad

    ReplyDelete
  24. I was just examining through the web looking for certain information and ran over your blog.It shows how well you understand this subject. Bookmarked this page, will return for extra. data science course in vadodara

    ReplyDelete
  25. Impressive blog to be honest definitely this post will inspire many more upcoming aspirants. Eventually, this makes the participants to experience and innovate themselves through knowledge wise by visiting this kind of a blog. Once again excellent job keep inspiring with your cool stuff.

    Data Science Training in Bhilai

    ReplyDelete
  26. Wonderful blog found to be very impressive to come across such an awesome blog. I should really appreciate the blogger for the efforts they have put in to develop such an amazing content for all the curious readers who are very keen of being updated across every corner. Ultimately, this is an awesome experience for the readers. Anyways, thanks a lot and keep sharing the content in future too.

    Data Science Course in Bhilai

    ReplyDelete
  27. Extraordinary blog went amazed with the content that they have developed in a very descriptive manner. This type of content surely ensures the participants to explore themselves. Hope you deliver the same near the future as well. Gratitude to the blogger for the efforts.

    Data Science Training

    ReplyDelete
  28. Stupendous blog huge applause to the blogger and hoping you to come up with such an extraordinary content in future. Surely, this post will inspire many aspirants who are very keen in gaining the knowledge. Expecting many more contents with lot more curiosity further.

    Data Science Certification in Bhilai

    ReplyDelete
  29. Stupendous blog huge applause to the blogger and hoping you to come up with such an extraordinary content in future. Surely, this post will inspire many aspirants who are very keen in gaining the knowledge. Expecting many more contents with lot more curiosity further.

    Data Science Certification in Bhilai

    ReplyDelete
  30. You completed certain reliable points there. I did a search on the subject and found nearly all people will agree with your blog.
    data science classes in hyderabad

    ReplyDelete

  31. Enroll yourself in the Data Science training online program and reach the epitome of success. We provide a world-class curriculum framed by top industry experts.
    business analytics training in hyderabad

    ReplyDelete
  32. あなたのライティングスキルは素晴らしかったです。記事をポイントごとに簡単に説明していただきました。本当に役に立ちました。貴重な記事を共有していただきありがとうございます。

    インスタグラムリールのダウンロード

    ReplyDelete