30 Commits

Author SHA1 Message Date
2fd699497f Rewrote explanation for dropped rows.
This was needed because the function for excluded rows was changed to show a
different key.
2025-04-30 04:06:51 -07:00
08ab9f126c Expanded probe_excluded_rows().
Shows blue bars to represent total professional developers, and red bars
to represent those included in the analysis.
x-axis is years of professional experience (changed from age).
2025-04-30 03:29:07 -07:00
d5443bd1fb Probe into developers excluded from analysis.
Added charts on participants who did not specify an annual income
compared with those who did.
Can print quantity of rows with NaN dropped.
2025-04-29 08:13:45 -07:00
2af2414219 Write DOCSTRINGS for functions.
Also corrected typo in label.
2025-04-29 08:13:26 -07:00
f49283a7cc Added train-test splitting for log regression. 2025-04-28 06:31:35 -07:00
1f7fe33915 Added function show_model_stats(). 2025-04-27 16:23:33 -07:00
3c3e804251 Simplified code. Removed to_frame(). 2025-04-27 12:42:39 -07:00
67d1441303 Added logarithmic regression
Not part of the course but fits better.
2025-04-27 10:49:37 -07:00
b18a5cb42a Implemented "risky" (pink) model.
Also cleaned up code.
Training on sorted data is not recommended and "risky";
however, the risky model appears to generalize
across random states.
2025-04-27 10:06:33 -07:00
311db886a4 Minor fixes to README.
Corrected typo of 'developers'.
Removed the bullet point for the only item under acknowledgements.
2025-04-25 01:47:21 -07:00
7b34548a2d Comply with PEP 8.
For consistent quotes and number of line breaks.
At least for the code that is being used.
2025-04-25 01:47:17 -07:00
fbdd5f3f18 Added cells for CRISP_DM.
Instructor said it was a nonlinear process,
but the grader wants it to be linear.
2025-04-25 01:13:45 -07:00
0a0281ab4e Updated title for graphs. 2025-04-24 01:45:08 -07:00
0f248e6b9a Wrote README. 2025-04-24 01:43:05 -07:00
7d82e4c588 Print only 2 significant figures of regression results. 2025-04-23 18:42:37 -07:00
721a38435b Repurposed horizontal line to show the y-intercept value. 2025-04-23 07:16:48 -07:00
bdcba003fe Preparing notebook for submission.
Added business understanding Q&A.
Labeled outputs by color (regression attempt).
Some code clean up.
2025-04-23 07:00:38 -07:00
320fcb343b Moved cells from data exploration after regressions. 2025-04-22 00:18:55 -07:00
a6bc83fa8b Started performing linear or log. regressions.
Split trials for different languages into their own cells.
2025-04-21 23:22:00 -07:00
f073019538 Created function to generate chart of salary over years of exp. 2025-04-20 22:00:57 -07:00
08eb095bf6 Added chart for years of experience and earnings.
Can select developers by programming language.
Colorize dots by country, employment status.
2025-04-20 20:32:27 -07:00
2d91e205a2 Came up with better names for axes (language comparison). 2025-04-20 07:47:29 -07:00
71d1efa292 Changed get_differences to result in a ratio. 2025-04-20 06:07:55 -07:00
c6096cfe6c Added new scatter plot.
Shows the difference between usage and desire to use a lang (=y)
over the usage of language (=x).
2025-04-19 20:03:19 -07:00
ea4ee3f493 Check against people who weren't paying attention. 2025-04-19 15:41:31 -07:00
0c5cc2259d Made all graphs have tight bounding boxes. 2025-04-19 14:58:09 -07:00
cbd575697f Squashed commit of the following:
commit e1691bb85b611c84ae9e4315523de1b79837ef2b
Author: scuti <scuti@tutamail.com>
Date:   Sat Apr 19 14:00:28 2025 -0700

    Created graph for job title and compensation

commit 50e00a42686f7135508ca08d1354a36012e839d7
Author: scuti <scuti@tutamail.com>
Date:   Sat Apr 19 06:38:16 2025 -0700

    Got visualization idea for annual compensation
2025-04-19 14:10:44 -07:00
e4ba004fec Added param. to save graphs to file. 2025-04-19 10:23:42 -07:00
59857f8d36 Showed difference in admired and desired languages.
Visualizations for "have worked with" and "want to work with".
2025-04-18 19:07:54 -07:00
be0ee359d4 Initial commit.
Exploring the popularity of programming languages
2025-04-18 16:02:25 -07:00