What patterns can be found when looking at different aspects of the data features?

Creating and presenting data analysis


Analyze the Centers for Disease Control and Prevention (CDC) social vulnerability data and data dictionary (CDC, 2018a; CDC, 2018b), which are used to determine the resiliency of communities, for three specific states: Alabama, Nebraska, and Georgia. Objective: Explore the dataset, considering the state, counties, and population, and four categories: socioeconomic features, household composition and disability features, minority status and language limitations, and housing types and transportation. In the interest of clarity, I will specify the variates associated with these categories:

Socioeconomic features

o Persons below the poverty estimate

o Civilian unemployed estimate

o Per capita income estimate

o Persons with no high school diploma

Household composition and disability features

o Ages 65 and older

o Ages 17 and under

o Persons with a disability, over the age of 5

o Single-parent households

Minority status and language limitations

o Persons with minority status

o Persons with no or minimal use of the English language

Housing types and transportation

o Multi-unit dwellings (10 or more units)

o Mobile homes

o Homes with more residents than a home is designed for

o Homes with no vehicle

o Group quarters or institutionalized quarters

Note: Use the columns with the prefix “E_”, which hold the estimates listed above. Do not use the columns that are follow-on calculations of these columns.

Consider the following research questions:

How do these factors relate analytically to the measure of social vulnerability (RPL_THEMES in the data set)? By the CDC standards, the closer the value is to one, the higher the vulnerability (CDC, 2018b).

What patterns can be found when looking at different aspects of the data features?

• How do different characteristics of the data relate?

• How well do these variates represent the vulnerability?

• Which characteristics have a more significant influence on predicting vulnerability?

Note: Do not repeat the calculations the CDC uses; develop a novel approach. If you use the method the CDC uses, you will earn a zero for the entire assignment.

Data Collection

The data and data dictionaries are online.

o Centers for Disease Control and Prevention. (2018a). Social vulnerability index [data set]. https://svi.cdc.gov/Documents/Data/2018_SVI_Data/CSV/SVI2018_US.csv

o Centers for Disease Control and Prevention. (2018b). Social vulnerability index [code book]. https://svi.cdc.gov/Documents/Data/2018_SVI_Data/SVI2018Documentation.pdf

o Note: Your raw data must be this data set in its original form.

• Create a subset of the data based on the situation and the objective. Note that “E_” columns are actual measures, while “M_” columns are margin-of-error estimates.
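The subsetting step above can be sketched in base R as follows. This is a minimal illustration on a made-up data frame, not the required solution: the column names (ST_ABBR, COUNTY, E_TOTPOP, E_POV, M_POV, EP_POV) are assumed from the 2018 code book and should be checked against it, and every value in the toy data is invented.

```r
# Toy stand-in for the raw SVI file (all values are invented).
# In practice the raw data would be read, unmodified, with read.csv().
svi_raw <- data.frame(
  ST_ABBR    = c("AL", "NE", "GA", "TX", "AL"),
  COUNTY     = c("County A", "County B", "County C", "County D", "County E"),
  E_TOTPOP   = c(50000, 30000, 18000, 57000, 200000),
  E_POV      = c(8000, 3900, 3600, 7900, 21000),
  M_POV      = c(1100, 570, 610, 1000, 2300),   # margin of error, not used
  EP_POV     = c(16.0, 13.0, 20.0, 13.9, 10.5), # derived percentage, not used
  RPL_THEMES = c(0.44, 0.34, 0.88, 0.69, 0.22),
  stringsAsFactors = FALSE
)

# States named in the objective (abbreviation column assumed to be ST_ABBR).
target_states <- c("AL", "NE", "GA")

# Identifier and outcome columns carried along with the predictors.
id_cols     <- c("ST_ABBR", "COUNTY")
outcome_col <- "RPL_THEMES"

# Keep columns whose names start with "E_" (estimates). The pattern does not
# match "M_" (margins of error) or derived prefixes such as "EP_".
estimate_cols <- grep("^E_", names(svi_raw), value = TRUE)

# Row filter by state, column filter by name.
svi_subset <- svi_raw[svi_raw$ST_ABBR %in% target_states,
                      c(id_cols, estimate_cols, outcome_col)]

print(svi_subset)
```

The real file may contain other “E_” columns that are not among the 15 predictors, so an explicit vector of column names may be safer than the prefix pattern.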

Data Cleaning:

• Review the data for issues.

• Do not transform this data. Do not remove outliers.

• Do not delete the NA values of the data. You may exclude them for certain types of analysis. The data dictionary or code book states how NA values are annotated in the document.

• If there are any erroneous data types, address the issues.

• Look for any other issues that may require cleaning.

o Do not automatically remove outliers, remove NA values, or replace NA values. Any of these actions will require justification.

• You may exclude cleaning from the presentation. You MUST include cleaning in your programming.
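The cleaning rules above can be illustrated with a short base R sketch. It assumes, for illustration only, that missing values are coded with the sentinel -999; the code book states the actual convention, so confirm it there. The data frame and its values are made up.

```r
# Hypothetical sentinel for missing values; confirm against the code book.
na_sentinel <- -999

# Toy data with one column read in as text by mistake (values are invented).
svi <- data.frame(
  COUNTY     = c("County A", "County B", "County C", "County D"),
  E_POV      = c("1200", "850", "-999", "430"),
  E_PCI      = c(25000, -999, 31000, 28000),
  RPL_THEMES = c(0.62, 0.35, -999, 0.18),
  stringsAsFactors = FALSE
)

# Address the erroneous data type: E_POV should be numeric.
svi$E_POV <- as.numeric(svi$E_POV)

# Recode the sentinel to NA in numeric columns. No rows are deleted.
num_cols <- names(svi)[sapply(svi, is.numeric)]
svi[num_cols] <- lapply(svi[num_cols], function(x) {
  replace(x, x %in% na_sentinel, NA)
})

# Report how many values are missing in each column.
na_counts <- colSums(is.na(svi))
print(na_counts)

# Exclude NA values only inside a specific calculation.
mean_pci <- mean(svi$E_PCI, na.rm = TRUE)
print(mean_pci)
```

Recoding the sentinel keeps every row in the data set while letting individual analyses exclude missing values through arguments such as na.rm.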


• Develop a plan and state that plan before extending beyond necessary cleaning.

o Your plan should include what you intend to do in your analysis.

o Your plan shall also include any assumptions or data preparation that must be done for a specific method of analysis.

• Conduct exploratory data analysis, as defined in your plan. This shall include the exploration of multiple different features and how they interrelate.

o Present a minimum of five explorations.

• You must include a thorough interpretation of each presented exploration. Do not describe every feature of the table or visualization; interpret critical points and trends. Ensure the investigations combined tell a story about the data. They should not be individual ideas, but concepts that tie together in some manner to bring you to a potential next stage of analysis.

• Any univariate analysis will not count toward the total of five visualizations.
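One possible multivariate exploration is a rank correlation between each predictor and the outcome, plus the correlations among the predictors themselves. The sketch below uses synthetic data, so the relationships in it are invented and say nothing about the real SVI file. The three predictor names are assumed from the code book.

```r
set.seed(42)  # reproducible synthetic data

n <- 60
svi <- data.frame(
  E_POV   = rpois(n, 900),
  E_UNEMP = rpois(n, 300),
  E_PCI   = round(rnorm(n, 27000, 4000))
)

# Synthetic outcome in (0, 1]: rises with poverty, falls with income (made up).
score <- scale(svi$E_POV) - scale(svi$E_PCI) + rnorm(n, sd = 0.5)
svi$RPL_THEMES <- as.numeric(rank(score) / n)

# A few missing values, excluded pairwise below rather than deleted.
svi$E_PCI[c(3, 17)] <- NA

predictors <- c("E_POV", "E_UNEMP", "E_PCI")

# Spearman correlation of each predictor with the outcome.
cor_with_outcome <- sapply(predictors, function(p) {
  cor(svi[[p]], svi$RPL_THEMES,
      method = "spearman", use = "pairwise.complete.obs")
})

# Correlation among predictors, which can flag redundant features.
cor_matrix <- cor(svi[predictors],
                  method = "spearman", use = "pairwise.complete.obs")

print(round(sort(cor_with_outcome, decreasing = TRUE), 2))
print(round(cor_matrix, 2))
```

Spearman is used here because it does not assume a linear relationship and is less sensitive to outliers, which the brief says must stay in the data.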

• Develop a new plan and state that plan before extending beyond exploratory data analysis. Your plan shall include, at a minimum:

o Split the data into training and testing sets, with 80% of the data in the training set.

o Develop a random forest model. Explore which independent variables have the most impact on the vulnerability index. Tune the random forest to find the best model, including the number of trees (ntree) and the number of variables tried for splitting at each tree node (mtry).

o Look at the importance of the different independent variables. What does this tell you about your data and your model?
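The modeling steps above can be sketched as follows, assuming the randomForest package is installed. The data are synthetic, the grid values for ntree and mtry are arbitrary choices for illustration, and the four predictor names are assumed from the code book. This is one way to structure the search, not the only one.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(2018)  # reproducible split and forests

# Synthetic stand-in for the cleaned subset (all values are invented).
n <- 200
svi <- data.frame(
  E_POV   = rpois(n, 900),
  E_UNEMP = rpois(n, 300),
  E_PCI   = round(rnorm(n, 27000, 4000)),
  E_NOVEH = rpois(n, 120)
)
score <- scale(svi$E_POV) - scale(svi$E_PCI) + rnorm(n, sd = 0.5)
svi$RPL_THEMES <- as.numeric(rank(score) / n)

# 80/20 split on row indices.
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train <- svi[train_idx, ]
test  <- svi[-train_idx, ]

# Small grid over ntree and mtry, scored by out-of-bag MSE on the training set.
grid <- expand.grid(ntree = c(100, 300, 500), mtry = 1:4)
grid$oob_mse <- NA_real_
for (i in seq_len(nrow(grid))) {
  fit <- randomForest(RPL_THEMES ~ ., data = train,
                      ntree = grid$ntree[i], mtry = grid$mtry[i])
  grid$oob_mse[i] <- fit$mse[grid$ntree[i]]  # OOB MSE after the last tree
}
best <- grid[which.min(grid$oob_mse), ]

# Refit the best configuration with importance measures enabled.
rf_best <- randomForest(RPL_THEMES ~ ., data = train,
                        ntree = best$ntree, mtry = best$mtry,
                        importance = TRUE)

# Held-out error and variable importance (%IncMSE and IncNodePurity).
test_rmse <- sqrt(mean((predict(rf_best, newdata = test) - test$RPL_THEMES)^2))
print(best)
print(test_rmse)
print(importance(rf_best))
```

Scoring the grid with out-of-bag error keeps the test set untouched until the final evaluation. randomForest does not accept NA values by default, so the plan should state how rows with missing values are handled for this method.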

• Are there any post hoc analyses that may improve your results?

Future Recommendations:

• You must also include recommendations for future analysis.

• You will base your recommendations on your findings in the analysis you conduct.

You must generate your presentation in R Markdown.

Do not forget to annotate your code with comments. You must include ALL the references you used in APA format in your presentation. If you use a source to assist in writing the programming code in your Rmd file, include that reference in APA format (no italics or indentation required) in a comment in the {r} chunk(s) to which it applies.

Required files to submit: You shall submit the Rmd file of your slides and any other files your R Markdown file relies on to knit, by Saturday night at midnight. When you present on Sunday, what you present and what you submit must be identical. Do not submit the raw data file.


Do not forget to reference the source of the data and data dictionary. It is in this document in APA 7. There are 15 predictors and one outcome variate that shall be used in exploratory data analysis and the random forest model.
