For our group project, we have chosen to analyze the dataset “Web3”. Would you recommend that we continue with this dataset?
Motivation: Web2 (2005–2020) is a centralized platform in which services are run by a few companies that act as internet gatekeepers, such as Apple and Google. Although developers benefited initially, it later became hard for them to survive on these centralized platforms.
Examples include Epic vs. Apple, Facebook vs. Zynga, and Google vs. Yelp (Dixon, 2021).
Web3, by contrast, is a decentralized platform that keeps the advanced features of web2 but is owned by developers and users.
Rationale: In web3, ownership and control are decentralized. Users and builders can own pieces of internet services by owning tokens, both non-fungible (NFTs) and fungible. Tokens give them the property rights to own a piece of the internet.
Because tokens are held by many users, the holders can work toward a common goal: growth of the community and appreciation of the token.
Business Implications: issuing native assets, holding the native asset and building the network, building the financial infrastructure for these native assets, payment tokens, work tokens, and burn tokens (Mersch, 2019).
In our project, we would therefore like to perform sentiment analysis on web3 using Twitter data.
Dixon, C. (2021, November 4). Why web3 matters. Future. Retrieved November 11, 2021, from https://future.a16z.com/why-web3-matters/.
Mersch, M. (2019, May 4). Which new business models will be unleashed by web 3.0? Medium. Retrieved November 11, 2021, from https://medium.com/fabric-ventures/which-new-business-models-will-be-unleashed-by-web-3-0-4e67c17dbd10.
You will apply your knowledge and skills of Python programming and business analytics
to organize and analyze real-life data for actionable insights. The following list enumerates
key points to include in your project. These points are not exhaustive. Your group may
decide to add content to these points. If your group identifies that some of these points
are not applicable for your project, consult the instructor.
2.1 Identify dataset
Each group will identify a dataset. I recommend identifying multiple datasets and brainstorming
possible questions you may ask and possible insights. If desired, you could discuss these
options with the instructor. Data can come from different sources:
• Directly from companies, organizations or people that you know. Before using such
datasets, you will need written permission to share the results with the instructor.
In certain cases, the instructor may require a review of the original dataset.
• Datasets available online from organizations, government agencies or universities,
etc. Browse websites such as UCI, kdnuggets.com and Kaggle.com (or any other
online data source you are aware of) for available datasets.
• Collected by you. You may get data from the web.
2.2 Identify research questions
Broadly, your project should include descriptive and predictive questions. Descriptive
questions describe the data. Examples include: what is the average number of downloads
for an App in Google Play Store? Who is the best salesman in the Northeast region? How
does the price change over the years? Is housing price correlated with zip code? You can
usually answer them with summary statistics or graphs. Such questions are part of the
team’s data exploration. You can have plenty of descriptive questions to understand your
data. You may present the most interesting ones in your report or presentation.
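As a rough illustration of answering descriptive questions with summary statistics, the sketch below uses only the Python standard library. The records, field names, and helper functions (`summarize`, `top_rep`) are hypothetical and made up for this example; a real project would load its own dataset instead.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical sales records (made-up values, for illustration only).
sales = [
    {"rep": "Avery", "region": "Northeast", "amount": 1200.0},
    {"rep": "Blake", "region": "Northeast", "amount": 950.0},
    {"rep": "Avery", "region": "Northeast", "amount": 800.0},
    {"rep": "Casey", "region": "South", "amount": 3000.0},
    {"rep": "Blake", "region": "Northeast", "amount": 400.0},
]


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


def top_rep(records, region):
    """Return (rep, total_sales) for the rep with the highest total in a region."""
    totals = defaultdict(float)
    for record in records:
        if record["region"] == region:
            totals[record["rep"]] += record["amount"]
    return max(totals.items(), key=lambda item: item[1])


if __name__ == "__main__":
    print(summarize([record["amount"] for record in sales]))
    print(top_rep(sales, "Northeast"))
```

The same questions could also be answered with a data-analysis library and a plot; the standard library is used here only to keep the sketch self-contained.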
Predictive questions intend to predict variable outcomes based on data. Examples
include: what is the prediction for next month’s sales? Is the customer going to default on
their loan? What might be the price for this house? What is the risk of the patient getting
readmitted? You will need to build predictive models to answer these questions. Each
project should have at least one predictive question.
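To make the idea of a predictive model concrete, here is a minimal sketch of one possible approach: a one-variable linear regression fitted by ordinary least squares. The house sizes and prices are invented for illustration, and the helpers `fit_line` and `predict` are hypothetical names, not part of any library. Your group may well choose a different method.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict y for a new x using a (slope, intercept) pair."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Made-up training data: house size (square meters) and price (thousands).
    sizes = [50, 80, 100, 120]
    prices = [110, 170, 210, 250]
    model = fit_line(sizes, prices)
    print(predict(model, 90))
```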
2.3.1 Explore Data
Examine the data. You may want to find out:
• Are there quality issues in the dataset (noisy, missing data, inconsistent, etc.)?
• What will you need to do to clean and/or transform the raw data for analysis?
• Explore each variable and the relationships between variables (Graphs and
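A first pass at the data-quality questions above might look like the following sketch, which counts missing values per column and converts a text column to numbers. The rows, the column names, and the set of strings treated as "missing" are all assumptions chosen for illustration; each dataset will need its own rules.

```python
# Hypothetical raw rows, as they might arrive from a CSV file.
rows = [
    {"id": 1, "price": "250000", "zip": "02139"},
    {"id": 2, "price": "", "zip": "02139"},
    {"id": 3, "price": "n/a", "zip": None},
    {"id": 4, "price": "310000", "zip": "2139"},
]

# Assumed markers for missing data; adjust these to the dataset at hand.
MISSING_MARKERS = {"", "n/a", "na", "null"}


def is_missing(value):
    """Treat None and common placeholder strings as missing."""
    return value is None or str(value).strip().lower() in MISSING_MARKERS


def missing_counts(records):
    """Count missing values per column."""
    counts = {}
    for record in records:
        for column, value in record.items():
            counts[column] = counts.get(column, 0) + (1 if is_missing(value) else 0)
    return counts


def to_float(value):
    """Convert a raw value to float, or None when it is missing."""
    return None if is_missing(value) else float(value)


if __name__ == "__main__":
    print(missing_counts(rows))
    print([to_float(row["price"]) for row in rows])
```

Note that the last row's zip code ("2139") is not missing but is inconsistent with the others, so a count of missing values alone will not catch every quality issue.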
2.3.2 Model Analysis
• Build predictive models: you may want to try a variety of methods.
• Evaluate the model performance.
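One common way to evaluate a model is to hold out part of the data for testing and compare predictions with actual values. The sketch below shows a hold-out split and two simple metrics written by hand: mean absolute error for numeric predictions and accuracy for class labels. The function names (`holdout_split`, `mae`, `accuracy`) and the default test fraction are assumptions for this example, not a prescribed procedure.

```python
import random


def holdout_split(items, test_fraction=0.25, seed=0):
    """Shuffle a copy of the items and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def mae(actual, predicted):
    """Mean absolute error between actual and predicted numbers."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def accuracy(actual, predicted):
    """Fraction of predicted labels that match the actual labels."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    train, test = holdout_split(range(8))
    print(len(train), len(test))
    print(mae([100, 200], [110, 180]))
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

The metrics should be computed on the test portion only, so that the reported performance reflects data the model did not see during fitting.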