How to Write a Data Science Blog Post

Prishita Kapoor
Aug 29, 2021

Code Functionality and Readability

Criteria: Code is readable (uses good coding practices, such as PEP 8).

Meet Specification: Code has an easy-to-follow logical structure. The code uses comments effectively and/or Notebook Markdown cells correctly. The steps of the data science process (gather, assess, clean, analyze, model, visualize) are also clearly identified with comments or Markdown cells. Variable and function names follow the PEP 8 style guide.

Criteria: Code is functional.

Meet Specification: All the project code is contained in a Jupyter notebook or script. If you use a notebook, it shows a demonstration of successful execution and output of the code.

Criteria: Write code that is well documented and uses functions and classes as necessary.

Meet Specification: Code is well documented and uses functions and classes as necessary. All functions include docstrings. DRY (Don't Repeat Yourself) principles are implemented.
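As a rough illustration of what this can look like, here is a minimal sketch of a documented helper that is reused instead of copy-pasted. The function name share_missing and the two sample columns are hypothetical and exist only for this example.

```python
def share_missing(values):
    """Return the fraction of entries in `values` that are None.

    Parameters
    ----------
    values : list
        Column values, where None marks a missing entry.

    Returns
    -------
    float
        Share of missing entries between 0 and 1 (0.0 for an empty list).
    """
    if not values:
        return 0.0
    return sum(value is None for value in values) / len(values)


# Hypothetical toy columns, used only to demonstrate the helper.
listing_prices = [120.0, None, 95.0, None]
review_scores = [4.5, 4.8, None, 4.9]

# DRY: one function serves every column, so the calculation is written once.
missing_by_column = {
    "listing_prices": share_missing(listing_prices),
    "review_scores": share_missing(review_scores),
}
print(missing_by_column)
```

The snake_case names follow PEP 8, and the docstring states what goes in and what comes out, which is usually enough for a reviewer to follow the code without running it.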

Data

Criteria: Project follows the CRISP-DM process while analyzing its data.

Meet Specification: Project follows the CRISP-DM process outlined for questions through communication. This can be done in the README, the notebook, or in a script. If a question does not require machine learning, descriptive or inferential statistics should be used to create a compelling answer to a particular question.
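For a question that needs no model, a descriptive statistic can be enough. The sketch below assumes a made-up question ("Is neighborhood A more expensive than neighborhood B?") and invented toy prices; none of these values come from a real dataset.

```python
from statistics import mean

# Hypothetical nightly prices, invented for illustration only.
prices_neighborhood_a = [100, 120, 140]
prices_neighborhood_b = [80, 90, 100]

# Descriptive answer: compare the average price in each group.
mean_a = mean(prices_neighborhood_a)
mean_b = mean(prices_neighborhood_b)
price_gap = mean_a - mean_b

print(f"Average A: {mean_a}, average B: {mean_b}, gap: {price_gap}")
```

With real data, an inferential test on top of the comparison would indicate whether a gap like this is likely to be more than noise.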

Criteria: Proper handling of categorical and missing values in the dataset.

Meet Specification: Categorical variables are handled appropriately for machine learning models (if models are created). Missing values are also handled appropriately for both descriptive and ML techniques. Document why a particular approach was used, and why it was appropriate for a particular situation.
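One possible way to do this with pandas is sketched below. The DataFrame, its column names, and the choice of median imputation are assumptions made for the example; the right approach depends on the dataset and should be justified in the project.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
listings = pd.DataFrame(
    {
        "price": [100.0, None, 80.0, 120.0],
        "room_type": ["private", "shared", None, "private"],
    }
)

# Numeric column: fill gaps with the median, which is less sensitive
# to extreme values than the mean.
median_price = listings["price"].median()
listings["price"] = listings["price"].fillna(median_price)

# Categorical column: one-hot encode, with dummy_na=True adding an extra
# indicator column so that "value was missing" stays visible to a model.
encoded = pd.get_dummies(listings, columns=["room_type"], dummy_na=True)

print(encoded)
```

Whichever strategy is chosen (dropping rows, imputing, or flagging missing values), the reason for it belongs in a comment or Markdown cell next to the code.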

GitHub Repository

Criteria: Student must publish their code in a public GitHub repository.

Meet Specification: Student must have a GitHub repository of their project. The repository must have a README.md file that communicates the libraries used, the motivation for the project, the files in the repository with a small description of each, a summary of the results of the analysis, and necessary acknowledgements. Students should not use another student’s code to complete the project, but they may use other references on the web including Stack Overflow and Kaggle to complete the project.

Blog Post

Criteria: Communicate their findings with stakeholders.

Meet Specification: Student must have a blog post on a platform of their own choice (their website, a Medium post, or a GitHub blog post). Student must communicate their results clearly. The post should not dive into technical details or difficulties of the analysis; these should be saved for GitHub. The post should be understandable for non-technical people from many fields.

Criteria: There should be an intriguing title and image related to the project.

Meet Specification: Student must have a title and image to draw readers to their post.

Criteria: The body of the post has paragraphs that are broken up by appropriate white space and images.

Meet Specification: There are no long, unbroken blocks of text anywhere in the post; line breaks or images separate them.

Criteria: Each question has a clearly communicated solution.

Meet Specification: Each question is answered with a clear visual, table, or statistic that shows how the data supports or contradicts a hypothesis that could be formed from that question of interest.
