<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>The Friction Point</title>
<link>https://www.jrwinget.com/blog.html</link>
<atom:link href="https://www.jrwinget.com/blog.xml" rel="self" type="application/rss+xml"/>
<description>Understanding where systems fail, and redesigning them through behavioral science and ethical innovation.</description>
<generator>quarto-1.8.26</generator>
<lastBuildDate>Thu, 18 Dec 2025 00:00:00 GMT</lastBuildDate>
<item>
  <title>What Is Cognitive Engineering? Building Technology for Human Minds</title>
  <link>https://www.jrwinget.com/blog/2025-12-18_cognitive-engineering/</link>
  <description><![CDATA[ 




<section id="the-incident-that-should-change-everything" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="the-incident-that-should-change-everything">The Incident That Should Change Everything</h2>
<p>Last month, a routine <a href="https://blog.cloudflare.com/18-november-2025-outage/">Cloudflare configuration update</a> took down a large portion of the internet. ChatGPT, Uber, government websites, and countless downstream services went dark. The cause was mundane: a Bot Management configuration file grew too large and triggered a latent failure mode.</p>
<p>A month earlier, AWS <code>us-east-1</code> suffered a cascading outage when <a href="https://cyberpress.org/amazon-aws-internet-outage/">internal DNS couldn’t resolve DynamoDB endpoints</a>. The circular dependency was documented, but the engineers who understood it weren’t in the room when the architecture was designed, and the monitoring systems were watching the wrong signals.</p>
<p>These incidents weren’t failures of skill, effort, or intent. Talented engineers were doing their jobs under pressure. What failed was something more fundamental: The systems exceeded the cognitive capacity of the humans responsible for operating them safely.</p>
<p>We’ve become remarkably good at optimizing systems for machines while systematically overlooking the cognitive needs of the people who build, maintain, and use them. We measure deployment frequency without tracking decision fatigue. We monitor uptime while cognitive debt accumulates invisibly until systems collapse under its weight.</p>

<div class="no-row-height column-margin column-container"><div class="">
<p><strong>Cognitive debt</strong> accumulates like technical debt: Quietly, compounding, and often invisible until failure.</p>
</div></div><p>Depending on how one measures “a decision”, the average developer easily makes hundreds to thousands of cognitive decisions per day. Software development is fundamentally a continuous, complex, and iterative decision-making process. These decisions range widely in scope and impact, from high-level architectural choices to low-level implementation details.</p>
<p>But each decision draws from a finite reservoir of attention and judgment. By mid-afternoon, that reservoir is depleted. By Friday, many teams are operating in deficit. Yet we design systems as if attention were infinite, memory perfect, and judgment immune to degradation.</p>
<p>There’s a better path forward, and it’s grounded in research from cognitive science, organizational psychology, and human factors engineering.</p>
</section>
<section id="what-is-cognitive-engineering" class="level2">
<h2 class="anchored" data-anchor-id="what-is-cognitive-engineering">What Is Cognitive Engineering?</h2>
<p>Cognitive engineering places human cognition at the center of system design. It treats attention, memory, judgment, and coordination as first-order constraints that shape everything from CI/CD pipelines to incident response procedures.</p>
<p>This goes well beyond traditional notions of “user experience”. Cognitive engineering recognizes that every technology exists within a <strong>distributed cognitive system</strong> composed of human minds, software, infrastructure, and teams. Performance emerges from how well these elements support one another.</p>
<div class="callout callout-style-simple callout-note no-icon callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon no-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Note</span>The Distributed Cognitive System
</div>
</div>
<div class="callout-body-container callout-body">
<p>Every system includes:</p>
<ul>
<li><strong>Human minds</strong> with finite attention and imperfect memory</li>
<li><strong>Software and infrastructure</strong> whose complexity must be understood and recalled</li>
<li><strong>Teams</strong> that process information collectively, with predictable biases</li>
<li><strong>Interfaces</strong> that impose cognitive demands on operators and users</li>
</ul>
<p>When any element fails, the system breaks at its most vulnerable point: People.</p>
</div>
</div>
<p>When we neglect the cognitive aspects of this system, the consequences are predictable. Developers burn out. Projects fail despite technical correctness. Systems function in theory but falter in practice.</p>
<p>Organizations that take cognition seriously see measurable results. When Microsoft’s dev division <a href="https://www.microsoft.com/en-us/research/publication/the-influence-of-organizational-structure-on-software-quality/">implemented cognitive load budgeting</a>, production incidents dropped by nearly half. When Etsy <a href="https://www.simform.com/blog/etsy-devops-case-study">redesigned their deployment pipeline around cognitive principles</a>, deployment frequency increased while errors fell dramatically. Supporting developer cognition and improving business outcomes turn out to be the same work.</p>
</section>
<section id="cognition-lives-everywhere-and-were-bad-at-seeing-it" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="cognition-lives-everywhere-and-were-bad-at-seeing-it">Cognition Lives Everywhere (And We’re Bad at Seeing It)</h2>
<section id="your-infrastructure-has-a-psychology-problem" class="level3 page-columns page-full">
<h3 class="anchored" data-anchor-id="your-infrastructure-has-a-psychology-problem">Your Infrastructure Has a Psychology Problem</h3>
<p>Ask any SRE what causes most outages and you’ll often hear “human error”. While this diagnosis might technically be true, it obscures a deeper truth. What we label as human error is frequently the consequence of systems that demand more cognitive resources than people can reliably provide.</p>

<div class="no-row-height column-margin column-container"><div class="margin-aside">
<p>What we call <strong>human error</strong> is often system design wearing a clever disguise.</p>
</div></div><p>In the Cloudflare incident, the system provided little cognitive support. There were no escalating warnings as configuration size approached limits. No validation that forced a pause. No accessible mental model of how configuration changes propagated through the proxy layer. The operator was left to notice something the system could’ve surfaced automatically.</p>
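<p>As a minimal sketch of what that kind of support could look like, the base R function below grades a configuration's size against a hard limit and escalates from a warning to a forced pause to an outright rejection. The function name <code>check_config_size()</code>, the 70% and 90% thresholds, and the limit of 1,000 entries are all hypothetical illustrations, not details of Cloudflare's system.</p>

```r
# Hypothetical sketch: grade a config's size against a hard limit so the
# operator sees escalating signals well before the failure point.
check_config_size <- function(n_entries, hard_limit,
                              warn_at = 0.7, block_at = 0.9) {
  stopifnot(n_entries >= 0, hard_limit > 0, warn_at < block_at)
  usage <- n_entries / hard_limit
  level <- if (usage >= 1) {
    "reject"  # at or over the limit: never deploy
  } else if (usage >= block_at) {
    "pause"   # close to the limit: require explicit confirmation
  } else if (usage >= warn_at) {
    "warn"    # trending toward the limit: surface it to the operator
  } else {
    "ok"
  }
  list(level = level, usage = round(usage, 2))
}

# 750 of 1000 entries is 75% of the limit, which falls in the "warn" band
result <- check_config_size(750, hard_limit = 1000)
result$level
```

<p>The point of the graded levels is that the system, rather than the operator's memory, carries the knowledge of how close a change is to the edge.</p>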
<p>The AWS incident followed similar logic. The circular dependency existed in documentation, but documentation and working memory are very different things. During an incident (especially an urgent, high-stakes one), people can only act on what they can actively hold in mind.</p>
<p>The <a href="https://sre.google/sre-book/eliminating-toil/">Google SRE handbook</a> addresses part of this problem through the concept of eliminating toil. But toil isn’t just repetitive work. It includes anything that unnecessarily drains cognitive resources: dashboards that require mental gymnastics, alerts that cry wolf, runbooks that assume perfect recall under stress, or deployment processes that require holding multiple interdependent concepts in mind at once.</p>
</section>
<section id="developers-are-users-too" class="level3">
<h3 class="anchored" data-anchor-id="developers-are-users-too">Developers Are Users Too</h3>
<p>Developer environments impose cognitive demands just as real as those faced by end users. Research by <a href="https://www.drcathicks.com/developer-thriving">Hicks, Lee, and Ramsey</a> on developer thriving shows that when cognitive work goes unrecognized, people disengage and leave.</p>
<p>They identify four factors that shape whether developers thrive:</p>
<ul>
<li><strong>Learning culture</strong>: Can people admit uncertainty safely?</li>
<li><strong>Agency</strong>: Can they influence how success is defined?</li>
<li><strong>Belonging</strong>: Are their contributions welcomed and built upon?</li>
<li><strong>Self-efficacy</strong>: Does the environment increase confidence over time?</li>
</ul>
<p>Think <strong>LABS</strong>.</p>
<p>Teams that create strong cognitive support across these dimensions ship faster, break less, and retain people longer. Measurement infrastructure plays a key role here. When it’s designed well, it makes invisible cognitive contributions visible, benefiting both equity and performance.</p>
</section>
<section id="when-teams-become-less-than-the-sum-of-their-parts" class="level3">
<h3 class="anchored" data-anchor-id="when-teams-become-less-than-the-sum-of-their-parts">When Teams Become Less Than the Sum of Their Parts</h3>
<p>Groups have the potential to outperform individuals, yet research shows they often fail to do so in systematic ways. One of the more robust group biases is the <a href="https://www.uni-muenster.de/imperia/md/content/psyifp/aeechterhoff/vorlesungkommunikation/stasser_titus_unsharedinfogroupdisc_jpsp1985.pdf">tendency to focus on shared information</a> (the information everyone already knows) while overlooking unique insights held by individual members. The problem worsens when teams <a href="https://d1wqtxts1xzle7.cloudfront.net/42226885/Biased_information_search_in_group_decis20160206-14055-1fruvn2-libre.pdf?1454789622=&amp;response-content-disposition=inline%3B+filename%3DBiased_information_search_in_group_decis.pdf">converge prematurely on problem definitions</a>. Once a framing becomes dominant, contradictory information gets filtered out rather than integrated.</p>
<p>Research by <a href="https://www.researchgate.net/profile/Verlin-Hinsz/publication/247408619_The_emerging_conceptualization_of_groups_as_information_processes/links/55563bd408ae6fd2d82360d0/The-emerging-conceptualization-of-groups-as-information-processes.pdf">Hinsz, Tindale, and Vollrath</a> shows what separates effective groups from dysfunctional ones: strong shared mental models. These include a common understanding of the problem space, clarity about who knows what, and explicit decision processes. In other words, high-performing teams invest in cognitive infrastructure.</p>
<p>When this infrastructure is missing, architectural meetings follow a familiar pattern. The senior engineer who knows the database will buckle assumes others already see the risk. The junior engineer who noticed an edge case doubts themselves. The product manager with critical customer context stays quiet because the discussion feels “technical”. Each person minimizes individual risk, and the collective intelligence of the group collapses.</p>
</section>
</section>
<section id="the-science-of-not-undermining-yourself" class="level2">
<h2 class="anchored" data-anchor-id="the-science-of-not-undermining-yourself">The Science of Not Undermining Yourself</h2>
<section id="small-signals-large-effects" class="level3">
<h3 class="anchored" data-anchor-id="small-signals-large-effects">Small Signals, Large Effects</h3>
<p>Belonging in technical teams doesn’t emerge from grand gestures. Research on microinclusions by <a href="https://www.greggmuragishi.com/uploads/5/7/1/5/57150559/microinclusions_muragishi_et_al_2024.pdf">Muragishi and colleagues</a> shows that small, consistent signals that recognize cognitive contributions have an outsized impact.</p>
<p>A simple “Building on what Jane said…” costs seconds and nothing else. Teams that practice these behaviors see dramatically higher retention and engagement. These effects occur because microinclusions increase cognitive visibility. They make mental work legible to others.</p>
<p>In environments where work happens inside people’s heads, recognition becomes infrastructure.</p>
</section>
<section id="the-measurement-trap-and-how-to-avoid-it" class="level3">
<h3 class="anchored" data-anchor-id="the-measurement-trap-and-how-to-avoid-it">The Measurement Trap (And How to Avoid It)</h3>
<p>Most teams measure what’s convenient rather than what’s meaningful. Lines of code tell us little. Story points are easily gamed. Even deployment frequency is ambiguous without context.</p>
<p>As <a href="https://arxiv.org/abs/2305.11030">Lee, Ramsey, and Hicks</a> note, effective measurement requires triangulation. When several independent indicators converge on the same underlying reality, you can trust the picture far more than any single metric.</p>
<div class="callout callout-style-default callout-tip no-icon callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon no-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Tip</span>Triangulating Truth
</div>
</div>
<div class="callout-body-container callout-body">
<p>Instead of relying on single metrics, try combining:</p>
<ul>
<li><strong>Cycle time</strong> (speed)</li>
<li><strong>Code review discussion quality</strong> (engagement)</li>
<li><strong>Developer confidence scores</strong> (sustainability)</li>
</ul>
<p>Together, they reveal whether you’re seeing fast delivery from engaged developers or speed that signals an approaching collapse.</p>
</div>
</div>
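<p>One way to picture triangulation is the base R sketch below, which flags any sprint where delivery got faster while both human signals fell. The data are made-up illustrative values, and the column names and the "all three moved the wrong way" rule are assumptions for the example, not a validated method.</p>

```r
# Made-up illustrative values for three sprints; column names are hypothetical.
metrics <- data.frame(
  sprint = c("S1", "S2", "S3"),
  cycle_time_days = c(5.0, 4.0, 3.0),         # lower is faster
  review_comments_per_pr = c(4.0, 4.5, 1.0),  # rough proxy for engagement
  confidence_score = c(4.2, 4.3, 2.9),        # 1-5 survey scale
  stringsAsFactors = FALSE
)

# Change from the previous sprint (NA for the first sprint)
delta <- function(x) c(NA, diff(x))

# Flag sprints where speed improved but engagement and confidence both dropped
metrics$warning <- delta(metrics$cycle_time_days) < 0 &
  delta(metrics$review_comments_per_pr) < 0 &
  delta(metrics$confidence_score) < 0

metrics[, c("sprint", "warning")]
```

<p>In this toy data, S2 is faster with steady engagement and confidence, so it is not flagged. S3 is faster still, but both human signals fall, which is the pattern a single speed metric would have hidden.</p>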
<p>Measure capabilities, not just outputs. Measure systems, not individuals. Measure to enable learning, not judgment.</p>
</section>
</section>
<section id="a-three-step-starting-point" class="level2">
<h2 class="anchored" data-anchor-id="a-three-step-starting-point">A Three-Step Starting Point</h2>
<p>Large-scale transformation can wait. Start with these concrete practices.</p>
<section id="implement-a-cognitive-load-budget" class="level3">
<h3 class="anchored" data-anchor-id="implement-a-cognitive-load-budget">1. Implement a Cognitive Load Budget</h3>
<p>You already manage error budgets. Apply the same thinking to cognitive resources.</p>
<p>During planning:</p>
<ul>
<li>Assign cognitive weight to tasks on a simple scale</li>
<li>Establish realistic team capacity</li>
<li>Stop adding work when the budget is reached</li>
</ul>
<p>Teams experimenting with this approach often see velocity increase, not decrease. Fewer tasks completed with full cognitive resources outperform many tasks completed in a depleted state.</p>
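<p>The planning steps above can be sketched in a few lines of base R. Everything here is a hypothetical illustration: the function name <code>plan_within_budget()</code>, the task names, the 1-to-5 weight scale, and the capacity of 10.</p>

```r
# Hypothetical sketch of a cognitive load budget for sprint planning.
# Weights run from 1 (routine) to 5 (novel and high-stakes).
plan_within_budget <- function(tasks, capacity) {
  # tasks: data frame with columns `task` and `weight`, in priority order
  stopifnot(all(tasks$weight > 0), capacity > 0)
  running_total <- cumsum(tasks$weight)
  # Stop adding work once the budget is reached; later tasks wait,
  # even small ones, so the plan respects the priority order.
  tasks$planned <- running_total <= capacity
  tasks
}

backlog <- data.frame(
  task = c("Migrate auth service", "Fix flaky test", "Upgrade ORM", "Tidy runbook"),
  weight = c(5, 2, 4, 1),
  stringsAsFactors = FALSE
)

plan <- plan_within_budget(backlog, capacity = 10)
plan$task[plan$planned]
```

<p>With a capacity of 10, the first two tasks (weights 5 and 2) fit, and the third would push the running total to 11, so planning stops there.</p>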
</section>
<section id="add-a-cognitive-review-lens" class="level3">
<h3 class="anchored" data-anchor-id="add-a-cognitive-review-lens">2. Add a Cognitive Review Lens</h3>
<p>Before approving a change, ask:</p>
<ul>
<li>Could someone understand this at 3 a.m. during an incident?</li>
<li>How many concepts must be held simultaneously to modify it?</li>
<li>Does this reduce or increase cognitive load for the next person?</li>
</ul>
</section>
<section id="make-cognitive-work-visible" class="level3">
<h3 class="anchored" data-anchor-id="make-cognitive-work-visible">3. Make Cognitive Work Visible</h3>
<p>In meetings and reviews:</p>
<ul>
<li>Explicitly build on others’ contributions</li>
<li>Direct questions to those with relevant expertise</li>
<li>Acknowledge insights publicly</li>
</ul>
<p>These small behaviors create the conditions for collective intelligence.</p>
</section>
</section>
<section id="the-technology-we-deserve" class="level2">
<h2 class="anchored" data-anchor-id="the-technology-we-deserve">The Technology We Deserve</h2>
<div class="callout callout-style-simple callout-note no-icon callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon no-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Note</span>Richard Hackman’s principle
</div>
</div>
<div class="callout-body-container callout-body">
<p><em>The best leaders are those who create conditions for others to succeed.<br></em> <em>— Paraphrased from </em>Leading Teams: Setting the Stage for Great Performances<em>, <a href="https://books.google.com/books?hl=en&amp;lr=&amp;id=snfoCQAAQBAJ&amp;oi=fnd&amp;pg=PR5&amp;dq=Leading+Teams:+Setting+the+Stage+for+Great+Performances&amp;ots=SCn5qDTc-g">J. Richard Hackman</a></em></p>
<p>The same principle applies to systems: The most effective technology amplifies human cognitive capabilities rather than demanding superhuman ones.</p>
</div>
</div>
<p>Cognitive engineering offers a path toward that kind of technology. It means designing systems that respect how minds actually work, including their strengths and their limits.</p>
<p>We’re at an inflection point. Systems are growing more complex. AI is entering workflows. Cognitive demands are increasing, not decreasing. We can continue treating human cognition as peripheral, or we can recognize it as the critical infrastructure it’s always been.</p>
</section>
<section id="introducing-the-cognitive-engineering-field-guide" class="level2">
<h2 class="anchored" data-anchor-id="introducing-the-cognitive-engineering-field-guide">Introducing the Cognitive Engineering Field Guide</h2>
<p>This post outlines what becomes possible when we design with cognition in mind. To go deeper, we’re creating the <strong>Cognitive Engineering Field Guide</strong>, a living, open-source manual for building human-aligned systems.</p>
<p>The guide translates decades of research into practical tools covering infrastructure design, team dynamics, measurement, and AI collaboration. It’s open science meeting open source, built as a community resource.</p>
<p>The first chapters launch next month, beginning with Foundations, Infrastructure, Teams, and Measurement.</p>
</section>
<section id="the-choice-is-yours" class="level2">
<h2 class="anchored" data-anchor-id="the-choice-is-yours">The Choice Is Yours</h2>
<p>Every system embodies assumptions about whose cognition matters. Every process either supports or undermines the minds executing it.</p>
<p>Cloudflare and AWS didn’t plan to take down the internet. They had capable engineers, sophisticated systems, and detailed runbooks. What they lacked was systematic attention to cognitive engineering.</p>
<p>The research is clear. The economics are compelling. The path forward is available.</p>
<p>What will you build tomorrow?</p>
<hr>
<p><em>The <strong>Cognitive Engineering Field Guide</strong> launches in early 2026! <a href="../../blog.html">Subscribe</a> for updates and early access to chapters as they’re released.</em></p>
<p><em>Have a story about cognitive engineering in practice? I’m actively collecting case studies for the Guide and would love to hear from you! <a href="mailto:contact@jrwinget.com">Share your experience</a> and help build this resource together.</em></p>


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{2025,
  author = {},
  title = {What {Is} {Cognitive} {Engineering?} {Building} {Technology}
    for {Human} {Minds}},
  date = {2025-12-18},
  url = {https://www.jrwinget.com/blog/2025-12-18_cognitive-engineering/},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-2025" class="csl-entry quarto-appendix-citeas">
<span>“What Is Cognitive Engineering? Building Technology for Human
Minds.”</span> 2025. December 18, 2025. <a href="https://www.jrwinget.com/blog/2025-12-18_cognitive-engineering/">https://www.jrwinget.com/blog/2025-12-18_cognitive-engineering/</a>.
</div></div></section></div> ]]></description>
  <category>Cognitive Engineering</category>
  <category>Software Development</category>
  <guid>https://www.jrwinget.com/blog/2025-12-18_cognitive-engineering/</guid>
  <pubDate>Thu, 18 Dec 2025 00:00:00 GMT</pubDate>
  <media:content url="https://www.jrwinget.com/blog/2025-12-18_cognitive-engineering/featured.png" medium="image" type="image/png" height="144" width="144"/>
</item>
<item>
  <title>{bidux} v0.3.3: Where Databases Meet Quick Decisions</title>
  <link>https://www.jrwinget.com/blog/2025-11-20_databases-meet-decisions/</link>
  <description><![CDATA[ 




<p>I’m thrilled to announce that <code>{bidux}</code> <strong>v0.3.3</strong> is now available on CRAN!</p>
<p>This release is about meeting you where you are: Whether that’s knee-deep in a database connection, racing against a deadline with Quarto, or just needing a quick UX fix without the ceremony. Because behavioral insight shouldn’t require a behavioral scientist in the room.</p>
<section id="your-database-your-telemetry-your-way" class="level2">
<h2 class="anchored" data-anchor-id="your-database-your-telemetry-your-way">Your Database, Your Telemetry, Your Way</h2>
<p>Let’s be honest: Most of our telemetry doesn’t live in neat CSV files. It lives in production databases, behind connection strings, in tables with names like <code>user_events_v2_final_FINAL</code>.</p>
<p>With <strong>v0.3.3</strong>, <code>bid_telemetry()</code> and <code>bid_ingest_telemetry()</code> now speak <code>{DBI}</code> fluently. Pass in your existing database connection (e.g., SQLite, PostgreSQL, MySQL, whatever you’ve got) and <code>{bidux}</code> will work with it directly:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(bidux)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(DBI)</span>
<span id="cb1-3"></span>
<span id="cb1-4"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Your existing database connection (that you definitely already have open)</span></span>
<span id="cb1-5">con <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">dbConnect</span>(RSQLite<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">SQLite</span>(), <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"production.db"</span>)</span>
<span id="cb1-6"></span>
<span id="cb1-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Old way: Export to file, then analyze (ugh)</span></span>
<span id="cb1-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># New way: Just point {bidux} at your connection</span></span>
<span id="cb1-9">issues <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_telemetry</span>(</span>
<span id="cb1-10">  con,</span>
<span id="cb1-11">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">table_name =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"shiny_events"</span>,  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># auto-detection is nice, but control is nicer</span></span>
<span id="cb1-12">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">thresholds =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_telemetry_presets</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"moderate"</span>)</span>
<span id="cb1-13">)</span>
<span id="cb1-14"></span>
<span id="cb1-15"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># The connection stays open: {bidux} doesn't presume to manage your resources</span></span>
<span id="cb1-16"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">dbDisconnect</span>(con)  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># When YOU'RE ready</span></span></code></pre></div></div>
</details>
</div>
<p>This isn’t just about convenience (though it is convenient). It’s about reducing the friction between noticing a problem and fixing it. Every export-import cycle is a chance for good intentions to die in a TODO list.</p>
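<p>If you want to try the connection-based workflow without touching production, a throwaway in-memory SQLite database is enough to rehearse it. The sketch below uses only <code>{DBI}</code> functions and assumes <code>{RSQLite}</code> is installed. The event columns are placeholders for illustration, not the schema <code>{bidux}</code> expects.</p>

```r
library(DBI)

# Throwaway in-memory database standing in for production
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Placeholder events; these columns are illustrative only
events <- data.frame(
  session_id = c("a1", "a1", "b2"),
  event_type = c("login", "input", "login"),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "shiny_events", events)

# Pre-flight checks before handing `con` to an analysis function
has_table <- dbExistsTable(con, "shiny_events")
fields <- dbListFields(con, "shiny_events")
n_rows <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM shiny_events")$n

# Close the connection when you are done with it
dbDisconnect(con)
```

<p>Checking that the table exists and has rows before analysis keeps a typo in a table name from turning into a confusing downstream error.</p>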
</section>
<section id="quick-suggestions-for-real-humans" class="level2">
<h2 class="anchored" data-anchor-id="quick-suggestions-for-real-humans">Quick Suggestions for Real Humans</h2>
<p>Sometimes you don’t need a five-stage behavioral intervention. Sometimes you just need to know: “Should this be a card layout or tabs?”</p>
<p>Enter <code>bid_quick_suggest()</code>, the function that respects your time:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># When your standup is in 10 minutes and users are confused</span></span>
<span id="cb2-2">suggestions <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_quick_suggest</span>(</span>
<span id="cb2-3">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Users can't find the export button after running analysis"</span></span>
<span id="cb2-4">)</span>
<span id="cb2-5"></span>
<span id="cb2-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Get immediate, ranked suggestions you can actually implement</span></span>
<span id="cb2-7"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">print</span>(suggestions)</span>
<span id="cb2-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># ✓ Add bslib::tooltip() to explain export availability</span></span>
<span id="cb2-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># ✓ Use bslib::card_header() to group export with results</span></span>
<span id="cb2-10"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># ✓ Consider shinyjs::toggleState() to enable only when ready</span></span></code></pre></div></div>
</details>
</div>
<p>No stages. No ceremony. Just behavioral science applied to your specific problem, because sometimes the perfect is the enemy of the good-enough-to-ship-today.</p>
</section>
<section id="structure-suggestions-you-can-actually-filter" class="level2">
<h2 class="anchored" data-anchor-id="structure-suggestions-you-can-actually-filter">Structure Suggestions You Can Actually Filter</h2>
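<p>The Structure stage now returns suggestions in a format that plays nicely with your existing data wrangling reflexes. Every suggestion comes as both the rich nested format and a practical tibble (the example assumes <code>{dplyr}</code> is attached for <code>filter()</code>, <code>select()</code>, and <code>arrange()</code>):</p>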
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1">structure_result <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_structure</span>(</span>
<span id="cb3-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">previous_stage =</span> anticipate_result,</span>
<span id="cb3-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">telemetry_flags =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_flags</span>(issues)  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Real usage data informing suggestions</span></span>
<span id="cb3-4">)</span>
<span id="cb3-5"></span>
<span id="cb3-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Find the low-hanging fruit</span></span>
<span id="cb3-7">quick_wins <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> structure_result<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>suggestions_tbl <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb3-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(difficulty <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Easy"</span>, score <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.7</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb3-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(title, components, rationale) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb3-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb3-11"></span>
<span id="cb3-12"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Or focus on navigation issues</span></span>
<span id="cb3-13">nav_improvements <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> structure_result<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>suggestions_tbl <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb3-14">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(category <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Navigation"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb3-15">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrange</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">desc</span>(score))</span></code></pre></div></div>
</details>
</div>
<p>This isn’t just about making data rectangular; it’s about making insights actionable. When you can <code>filter()</code> and <code>arrange()</code> your way to the next improvement, you’re more likely to actually make it.</p>
</section>
<section id="better-sources-better-science" class="level2">
<h2 class="anchored" data-anchor-id="better-sources-better-science">Better Sources, Better Science</h2>
<p>We’ve also strengthened our theoretical foundations. The Beautiful-Is-Good Stereotype now properly cites both its social psychology origins (Dion et al., 1972) and its UX applications (Tractinsky et al., 2000). The Data Storytelling Framework now cites the visualization practitioners it draws on (Knaflic, 2015; Dykes, 2020).</p>
<p>These citations are useful breadcrumbs for when you need to dig deeper, defend a design decision, or learn why a suggestion actually works.</p>
</section>
<section id="quarto-dashboards-weve-got-you" class="level2">
<h2 class="anchored" data-anchor-id="quarto-dashboards-weve-got-you">Quarto Dashboards? We’ve Got You</h2>
<p>The documentation now explicitly supports Quarto dashboard developers. While <code>{bidux}</code> was born in the Shiny world, most of our suggestions translate beautifully to Quarto’s static and <code>server: shiny</code> contexts:</p>
<ul>
<li><strong>Static Quarto dashboards</strong>: Layout suggestions, component recommendations, and the full BID framework apply directly</li>
<li><strong>Interactive Quarto</strong> (with <code>server: shiny</code>): Everything works, including telemetry integration</li>
</ul>
<p>Because good UX principles don’t care about your rendering engine.</p>
</section>
<section id="a-tool-that-respects-your-context" class="level2">
<h2 class="anchored" data-anchor-id="a-tool-that-respects-your-context">A Tool That Respects Your Context</h2>
<p>What I love about this release is how it meets people where they are. Got a database? Connect directly. Need something quick? <code>bid_quick_suggest()</code>. Want to integrate with your dplyr workflow? Here’s a tibble. Building in Quarto? We support that too.</p>
<p>This is what cognitive engineering looks like in practice: Reducing the cognitive load of reducing cognitive load. Making the right thing the easy thing.</p>
</section>
<section id="install-and-explore" class="level2">
<h2 class="anchored" data-anchor-id="install-and-explore">Install and Explore</h2>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">install.packages</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bidux"</span>)</span></code></pre></div></div>
</details>
</div>
<p>For the full details, including migration notes and comprehensive examples, check out the <a href="https://github.com/jrwinget/bidux/releases/tag/0.3.3">release on GitHub</a>.</p>
</section>
<section id="whats-next" class="level2">
<h2 class="anchored" data-anchor-id="whats-next">What’s Next?</h2>
<p>Every release teaches us something about how people actually use behavioral science in production. Your telemetry databases, your quick fixes, your Quarto dashboards: They all inform where <code>{bidux}</code> goes next.</p>
<p>So please <a href="https://github.com/jrwinget/bidux/issues">open issues</a>, share your workflows, and tell us what friction you’re facing. Because the best tools are built in conversation with the people who use them.</p>
<p>Happy designing, and remember: Sometimes the best behavioral intervention is the one that ships today 🚀</p>


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{2025,
  author = {},
  title = {`\{Bidux\}` V0.3.3: {Where} {Databases} {Meet} {Quick}
    {Decisions}},
  date = {2025-11-20},
  url = {https://www.jrwinget.com/blog/2025-11-20_databases-meet-decisions/},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-2025" class="csl-entry quarto-appendix-citeas">
<span>“`{Bidux}` V0.3.3: Where Databases Meet Quick Decisions.”</span>
2025. November 20, 2025. <a href="https://www.jrwinget.com/blog/2025-11-20_databases-meet-decisions/">https://www.jrwinget.com/blog/2025-11-20_databases-meet-decisions/</a>.
</div></div></section></div> ]]></description>
  <category>Software Development</category>
  <category>User Experience</category>
  <guid>https://www.jrwinget.com/blog/2025-11-20_databases-meet-decisions/</guid>
  <pubDate>Thu, 20 Nov 2025 00:00:00 GMT</pubDate>
  <media:content url="https://www.jrwinget.com/blog/2025-11-20_databases-meet-decisions/featured.png" medium="image" type="image/png" height="167" width="144"/>
</item>
<item>
  <title>When Groups Get Stuck on the Wrong Problem</title>
  <link>https://www.jrwinget.com/blog/2025-10-31_when-groups-get-stuck/</link>
  <description><![CDATA[ 




<section id="a-crowd-a-crisis-and-the-wrong-problem" class="level2">
<h2 class="anchored" data-anchor-id="a-crowd-a-crisis-and-the-wrong-problem">A Crowd, A Crisis, and the Wrong Problem</h2>
<p>The crowds gathered fast. Flags, chants, phone cameras, all aimed squarely at the latest political flashpoint. The air pulsed with conviction. Something needed to be done <em>now</em>.</p>
<p>On the surface, it looked like decisive action. Underneath, something quieter was happening: a narrowing of attention, a collapse of deliberation, and the slow suffocation of the space for dissent.</p>
<p><img src="https://www.jrwinget.com/blog/2025-10-31_when-groups-get-stuck/img/broadview.png" class="float-end rounded shadow-sm img-fluid" style="width:45.0%" alt="Illustration showing a broad perspective view of multiple interconnected systems and factors, representing how collective attention can narrow and miss underlying systemic issues"></p>
<p>When a society starts rewarding volume over substance, when disagreement feels like betrayal, groups of every size (from governments to dev teams) begin solving the wrong problems. The visible threat becomes the total focus, while the underlying systems quietly degrade.</p>
<p>Protest and mobilization have always been vital forces for change. But when attention is hijacked by spectacle rather than substance, even the most righteous energy can be redirected away from root causes. Especially when those in power find it useful to change the subject.</p>
</section>
<section id="how-groups-actually-think" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="how-groups-actually-think">How Groups Actually Think</h2>
<p>Groups aren’t irrational; they’re just wired for coordination first and accuracy second.</p>
<p>Decades of research by scholars like Ivan Steiner, James Davis, Scott Tindale, and Verlin Hinsz<sup>1</sup> show that groups behave less like committees of thinkers and more like <strong>information processors</strong>: limited, distributed, and biased toward what’s already shared among their members.</p>
<p>In easy tasks, that works beautifully. Shared information dominates the discussion, members align, and confidence rises. But in complex tasks, where critical information is unevenly distributed, the same bias backfires. Members repeat what’s already known, while unshared facts stay buried in individual minds. The group, ironically, becomes <em>less</em> intelligent than its smartest member.</p>

<div class="no-row-height column-margin column-container"><div class="">
<p>This is the <strong>hidden-profile problem</strong>: when the best solution requires integrating information that no single member fully possesses, groups systematically fail unless structures force information sharing.</p>
</div></div><p>That’s why systems designed for consensus can unintentionally hide insight. They reduce friction but also filter out the unusual, the dissenting, and the diagnostic. It’s the hidden-profile problem in every form, from juries to code reviews to global policy debates.</p>
<div class="callout callout-style-simple callout-note no-icon callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon no-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Note</span>Richard Hackman’s rule<sup>2</sup>
</div>
</div>
<div class="callout-body-container callout-body">
<p><br> Performance isn’t about motivation or personality; it’s about <em>design.</em><br> Groups fail because their structures make the right conversations impossible.</p>
</div>
</div>
</section>
<section id="the-comfort-of-shared-myopia" class="level2">
<h2 class="anchored" data-anchor-id="the-comfort-of-shared-myopia">The Comfort of Shared Myopia</h2>
<p>Most groups don’t consciously suppress new information. They just drift toward what’s easy to agree on.</p>
<p>The bias runs deep: Sharedness feels safe. Familiar ideas reward us with microbursts of social approval. Each nod and “exactly!” strengthens the illusion that consensus equals truth.</p>
<p><img src="https://www.jrwinget.com/blog/2025-10-31_when-groups-get-stuck/img/pluralistic-ignorance.png" class="float-end rounded shadow-sm img-fluid" style="width:45.0%" alt="Diagram showing pluralistic ignorance: multiple figures with thought bubbles where outer bubbles show agreement with the group but inner thoughts reveal private doubts, illustrating how silent agreement masks private disagreement"></p>
<p>Pluralistic ignorance amplifies this. Everyone privately doubts the shared focus, but no one wants to be the first to say so. And so, silence masquerades as agreement. Even those who <em>do</em> see the structural issue start calibrating their language, testing the wind before speaking up.</p>
<p>You’ve seen this in the workplace too. A dev team argues for hours about a UI color but defers a looming database overhaul. The visible topic feels more manageable, more social, more <em>presentable</em>. Under pressure to align, attention collapses to the path of least resistance.</p>
<p>In social systems and engineering alike, the group’s collective gaze narrows exactly when it needs to widen.</p>
</section>
<section id="the-minority-report" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="the-minority-report">The Minority Report</h2>
<p>This is where minority influence enters the picture. And where Tindale’s work becomes vital again.</p>
<p>Contrary to popular belief, minority influence isn’t about defiance. It’s about <strong>maintenance</strong>. Groups need dissent not because it feels good, but because it keeps the system adaptive. Minorities serve as informational scouts, probing the blind spots created by majority dynamics.</p>
<p>When dissenters persist with clarity and evidence, they do something remarkable: They <em>reframe</em> the problem space!</p>

<div class="no-row-height column-margin column-container"><div class="margin-aside">
<p>Moscovici’s studies showed that even a single consistent dissenter can shift the majority’s private judgments, even when public positions remain unchanged.</p>
</div></div><p>Serge Moscovici<sup>3</sup> called this <strong>conversion</strong>, a quiet shift in others’ internal representations. The surface debate might stay unchanged, but the cognitive landscape underneath reorganizes. People start <em>thinking differently</em>, even if they don’t admit it out loud.</p>
<p>In political life, this can look like a single journalist or local mayor calling out creeping authoritarianism long before it trends. In software development, it’s the engineer who refuses to cut corners on testing because she’s seen what happens when systems fail silently. Different stakes, same mechanism: Dissent preserves information diversity, the lifeblood of collective intelligence.</p>
</section>
<section id="the-real-threat" class="level2">
<h2 class="anchored" data-anchor-id="the-real-threat">The Real Threat</h2>
<p>Authoritarianism thrives on the suppression of dissent not just for control, but because it <em>simplifies cognition</em>. When loyalty replaces deliberation, systems get faster…but dumber. They trade complexity for coordination, nuance for narrative.</p>
<p>That same tradeoff happens in corporate culture when “alignment” becomes a euphemism for “agreement”. A perfectly aligned team can ship a perfectly flawed product. Everyone’s happy…until the structure breaks under the weight of the unspoken.</p>
<p>Hackman would say the fix isn’t to exhort people to “speak up”. It’s to <strong>redesign the environment</strong> so that dissent doesn’t require heroism.</p>
<div class="callout callout-style-default callout-tip no-icon callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon no-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Tip</span>Quick wins for safer dissent
</div>
</div>
<div class="callout-body-container callout-body">
<ul>
<li>Structure meetings so unique information is surfaced <em>first</em>, before discussion</li>
<li>Assign a rotating “minority advocate” in critical design reviews</li>
<li>Value <em>diagnostic questions</em> as much as polished answers</li>
</ul>
<p>These may sound procedural, but they’re moral architecture. They protect the system’s capacity for truth.</p>
</div>
</div>
</section>
<section id="why-groups-chase-the-visible-target" class="level2">
<h2 class="anchored" data-anchor-id="why-groups-chase-the-visible-target">Why Groups Chase the Visible Target</h2>
<p>Let’s go back to that protest scene. Every chant, every viral post, every demand for “decisive action” is a bid for cognitive simplicity. People want to make sense of chaos. They want to see the problem, name it, and fix it.</p>
<p>That impulse isn’t wrong; it’s human. Collective outrage is often the first spark of accountability. The danger comes when leaders or algorithms capture that spark and aim it somewhere safer for power.</p>
<p>Leaders often exploit this tendency, redirecting attention to symbolic skirmishes while dismantling the systems that actually distribute power. The crowd doesn’t see the scaffolding behind the spectacle: bureaucratic hollowing, erosion of norms, loss of institutional memory. They’re staring at the storm, not the climate.</p>
<p>You can watch this in microcosm every day inside organizations. A leadership team zeroes in on quarterly optics while ignoring crumbling infrastructure. A product team launches a glossy feature that hides the absence of documentation.</p>
<p>When attention becomes performance, groups start mistaking <em>visibility</em> for <em>impact</em>.</p>
<p>The cure is not cynicism; it’s <strong>redesign</strong>. Make dissent cheap. Make invisible work visible. Create conditions where the most informative signal isn’t the loudest one.</p>
</section>
<section id="designing-for-dissent" class="level2">
<h2 class="anchored" data-anchor-id="designing-for-dissent">Designing for Dissent</h2>
<p>So how do we design for dissent without creating chaos?</p>
<p>Hackman’s decades of research give us a deceptively simple blueprint: <em>Structure, clarity, and purpose are the enabling conditions of collective intelligence.</em></p>
<div class="callout callout-style-simple callout-important no-icon">
<div class="callout-body d-flex">
<div class="callout-icon-container">
<i class="callout-icon no-icon"></i>
</div>
<div class="callout-body-container">
<ol type="1">
<li><p><strong>Clarify the real task:</strong> Many coordination failures stem from misaligned problem definitions. Before debate, define what “good” looks like and who holds which pieces of information.</p></li>
<li><p><strong>Make unique information explicit:</strong> Start meetings or decisions with a quick round of “what do you know that others might not?” It flattens status cues and seeds information diversity.</p></li>
<li><p><strong>Institutionalize dissent:</strong> Rotate a “devil’s advocate” or “minority role.” Signal that critique is contribution, not disloyalty.</p></li>
<li><p><strong>Reward reframing:</strong> Give airtime not only to solutions but to people who <em>redefine the problem</em> accurately.</p></li>
<li><p><strong>Protect deliberation bandwidth:</strong> Deadlines and visibility metrics compress attention. Allocate protected time for slow, high-value thinking.</p></li>
</ol>
</div>
</div>
</div>
<p>These are structural levers, not personality hacks. They let courage scale without burning out the people who supply it.</p>
</section>
<section id="when-the-group-turns-on-itself" class="level2">
<h2 class="anchored" data-anchor-id="when-the-group-turns-on-itself">When the Group Turns on Itself</h2>
<p>There’s a haunting irony in collective processes: The same cohesion that helps a group survive can make it fragile to error.</p>
<p>As coordination improves, the social cost of dissent rises. Eventually, good people start self-silencing, assuming someone else will say what needs saying. But no one does. And slowly, the system forgets how to self-correct. You can see this in collapsing democracies, in failing companies, in scientific replication crises, even in open-source communities.</p>
<p>The details change, but the dynamic is identical: When sharedness becomes the goal, truth becomes a casualty.</p>
<p>Dr.&nbsp;Cat Hicks<sup>4</sup> has described curiosity as a shared resource. Dissent is too. It’s how groups stay honest about what they know and what they don’t. But unlike curiosity, dissent doesn’t feel communal. It feels risky.</p>
<p>That’s why structure matters. It lowers the <em>social tax</em> of honesty.</p>
</section>
<section id="reframing-whats-worth-talking-about" class="level2">
<h2 class="anchored" data-anchor-id="reframing-whats-worth-talking-about">Reframing What’s Worth Talking About</h2>
<p><img src="https://www.jrwinget.com/blog/2025-10-31_when-groups-get-stuck/featured.png" class="float-start rounded shadow-sm img-fluid" style="width:42.0%" alt="Collage representing group dynamics and systems thinking, with interconnected elements and human figures engaged in dialogue about collective intelligence and decision-making"></p>
<p>Imagine if our public discourse treated dissent the way high-reliability teams treat anomalies: as signals to investigate, not threats to silence. Imagine if our teams treated disagreement as a sign that the system is alive.</p>
<p>Protest, dissent, and debate are how we remind systems that truth is still a shared project. Because dissent, at its best, is an act of stewardship.</p>
<p>It’s the refusal to let collective attention decay into convenience. It’s not rebellion; it’s repair.</p>
<p>So the next time you’re in a meeting, a debate, or a crowd, notice what everyone’s staring at…and what no one’s mentioning. That’s often where the real work begins.</p>
<p><em>(And if you’re reading this on Halloween, consider it your annual reminder that sometimes the scariest thing in a group isn’t the conflict: It’s the silence.)</em></p>
</section>
<section id="closing-thoughts" class="level2">
<h2 class="anchored" data-anchor-id="closing-thoughts">Closing Thoughts</h2>
<p>When groups get stuck on the wrong problem, it’s rarely because they’re malicious or ignorant. It’s because sharedness feels safe, and dissent feels costly.</p>
<p>But every complex system, whether a government, a research team, or a startup, depends on its ability to surface unshared truths. The question isn’t whether we’ll disagree. It’s whether we’ve designed the conditions that let disagreement matter.</p>
<p>If democracy is a system for making collective decisions, then teams are its smallest working model. Get the micro right, and the macro starts to follow.</p>
<p>Every movement for justice begins as minority influence. One small group insisting the system can do better.</p>
<p>And maybe that’s the lesson tucked inside all this research. Collective intelligence doesn’t come from harmony. It comes from friction handled well.</p>


</section>


<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>Key figures in group information processing theory and social decision schemes research.</p></li>
<li id="fn2"><p>J. Richard Hackman, pioneering researcher on team effectiveness and organizational design.</p></li>
<li id="fn3"><p>Social psychologist who developed the theory of minority influence and social representations.</p></li>
<li id="fn4"><p>Researcher studying developer experience, social dynamics in technical teams, and organizational learning.</p></li>
</ol>
</section><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{2025,
  author = {},
  title = {When {Groups} {Get} {Stuck} on the {Wrong} {Problem}},
  date = {2025-10-31},
  url = {https://www.jrwinget.com/blog/2025-10-31_when-groups-get-stuck/},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-2025" class="csl-entry quarto-appendix-citeas">
<span>“When Groups Get Stuck on the Wrong Problem.”</span> 2025. October
31, 2025. <a href="https://www.jrwinget.com/blog/2025-10-31_when-groups-get-stuck/">https://www.jrwinget.com/blog/2025-10-31_when-groups-get-stuck/</a>.
</div></div></section></div> ]]></description>
  <category>Behavioral Science</category>
  <guid>https://www.jrwinget.com/blog/2025-10-31_when-groups-get-stuck/</guid>
  <pubDate>Fri, 31 Oct 2025 00:00:00 GMT</pubDate>
  <media:content url="https://www.jrwinget.com/blog/2025-10-31_when-groups-get-stuck/featured.png" medium="image" type="image/png" height="144" width="144"/>
</item>
<item>
  <title>{bidux} v0.3.2: Leaner, Quieter, More Intuitive BID Workflows</title>
  <link>https://www.jrwinget.com/blog/2025-10-29_leaner-quieter-intuitive/</link>
  <description><![CDATA[ 




<p>I’m excited to announce that <code>{bidux}</code> <strong>v0.3.2</strong> is now available on CRAN!</p>
<p>This update is a polishing release that focuses on <strong>API ergonomics and package optimization</strong>. We’ve flattened out old complexities (goodbye nested lists!), added knobs to fine-tune telemetry with ease, and even given you a “mute button” for verbose output. All these changes maintain <strong>100% backward compatibility</strong>, so you get smoother workflows without breaking your existing code.</p>
<section id="flattening-the-story-intuitive-data-stories" class="level2">
<h2 class="anchored" data-anchor-id="flattening-the-story-intuitive-data-stories">Flattening the Story: Intuitive Data Stories</h2>
<p>Crafting a BID data story in <strong>v0.3.2</strong> is more intuitive. The <code>new_data_story()</code> function now uses a <strong>flat API</strong> with four straightforward arguments (<code>hook</code>, <code>context</code>, <code>tension</code>, <code>resolution</code>), instead of nesting parts of the story inside separate lists.</p>
<p>In other words, you can directly specify the key elements of your story without the cognitive overhead of wrapping them in <code>variables</code> and <code>relationships</code> lists. This covers about 80% of use cases with a cleaner syntax:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Before (nested format, now deprecated)</span></span>
<span id="cb1-2">story_old <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">new_data_story</span>(</span>
<span id="cb1-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">context =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dashboard usage dropped"</span>,</span>
<span id="cb1-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">variables =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">hook =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"User engagement declining"</span>),</span>
<span id="cb1-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">relationships =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">resolution =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Analyze telemetry"</span>)</span>
<span id="cb1-6">)</span>
<span id="cb1-7"></span>
<span id="cb1-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># After (flat format, v0.3.2+)</span></span>
<span id="cb1-9">story <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">new_data_story</span>(</span>
<span id="cb1-10">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">hook =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"User engagement declining"</span>,</span>
<span id="cb1-11">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">context =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dashboard usage dropped 30%"</span>,</span>
<span id="cb1-12">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">tension =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Unsure if UX or user needs are causing the drop"</span>,</span>
<span id="cb1-13">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">resolution =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Analyze telemetry for friction points"</span></span>
<span id="cb1-14">)</span></code></pre></div></div>
</details>
</div>
<p>The new flat syntax is not only easier to read, it’s also future-proof. The old nested format still works for now, but will trigger a deprecation warning and is slated for removal in 0.4.0.</p>
<p>I recommend updating your code to the flat style soon. Your future self will thank you.</p>
<p>On the bright side, all other BID functions seamlessly accept the new <code>bid_data_story</code> objects, and printing a story now neatly lists out the hook, context, tension, and resolution for a quick overview.</p>
</section>
<section id="telemetry-tuning-made-easy" class="level2">
<h2 class="anchored" data-anchor-id="telemetry-tuning-made-easy">Telemetry Tuning Made Easy</h2>
<p>If you’ve been ingesting telemetry data to inform your designs, <code>{bidux}</code> v0.3.2 introduces a welcome convenience: <strong>telemetry sensitivity presets</strong>.</p>
<p>The new <code>bid_telemetry_presets()</code> function gives you pre-configured threshold sets (<code>"strict"</code>, <code>"moderate"</code>, <code>"relaxed"</code>) for flagging UX issues. This means you no longer need to hand-tweak what counts as an “unused input” or a “delay” in user interactions. Just pick a preset that fits your scenario:</p>
<ul>
<li><strong>Strict:</strong> Use for brand-new or mission-critical dashboards to catch even minor friction (e.g.&nbsp;flags inputs under ~2% usage, delays over ~20 seconds).</li>
<li><strong>Moderate:</strong> The default balance for most applications, offering a good signal-to-noise ratio (e.g.&nbsp;flags inputs under ~5% usage, delays over ~30 seconds).</li>
<li><strong>Relaxed:</strong> For mature, stable dashboards where only major issues merit attention (e.g.&nbsp;flags inputs under ~10% usage, delays over ~60 seconds).</li>
</ul>
<p>These presets let you adapt to your app’s context with one line of code. For example, a brand new internal app might start with <code>bid_telemetry_presets("strict")</code> to catch early pain points, whereas a long-running product dashboard might use <code>"relaxed"</code> to focus only on high-impact issues. Under the hood, each preset just supplies a list of threshold values to the <code>thresholds</code> argument of <code>bid_telemetry()</code>, so you retain full control if you ever need a custom tweak.</p>
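<p>To make the idea concrete, here is a minimal sketch in base R of what a preset boils down to: a named list of cutoffs. It does not call <code>{bidux}</code>. The names <code>preset_sketch</code>, <code>unused_input_share</code>, <code>delay_seconds</code>, and <code>flag_unused()</code> are invented for illustration and are not the package’s actual threshold names, and the numbers are the approximate values from the list above.</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># Illustrative only: hypothetical names, not the bidux API
preset_sketch &lt;- list(
  strict   = list(unused_input_share = 0.02, delay_seconds = 20),
  moderate = list(unused_input_share = 0.05, delay_seconds = 30),
  relaxed  = list(unused_input_share = 0.10, delay_seconds = 60)
)

# Return the names of inputs whose share of sessions falls below the cutoff
flag_unused &lt;- function(usage_share, preset) {
  cutoff &lt;- preset_sketch[[preset]]$unused_input_share
  names(usage_share)[usage_share &lt; cutoff]
}

# Made-up share of sessions in which each input was touched
usage &lt;- c(date_range = 0.40, region = 0.07, advanced_filter = 0.01)

flag_unused(usage, "strict")   # flags "advanced_filter"
flag_unused(usage, "relaxed")  # flags "region" and "advanced_filter"</code></pre></div>
</div>
<p>The same made-up usage data yields a different set of flagged inputs depending on which cutoff applies, which is the decision a preset makes for you.</p>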
</section>
<section id="enjoy-the-silence-quiet-mode" class="level2">
<h2 class="anchored" data-anchor-id="enjoy-the-silence-quiet-mode">Enjoy the Silence: Quiet Mode</h2>
<p>Do you ever feel inundated by informational messages when running BID analyses or pipelines? In v0.3.2, quiet mode is here to help.</p>
<p>Every BID stage function (from <code>bid_interpret()</code> through <code>bid_validate()</code>) gains a new <code>quiet</code> parameter. Set <code>quiet = TRUE</code> and those friendly yet noisy console messages will be suppressed. This can make a huge difference when running complex pipelines or rendering Shiny apps where clean logs are gold.</p>
<p>For example, you can wrap an entire sequence of stages in silence:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(bidux)</span>
<span id="cb2-2"></span>
<span id="cb2-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Globally turn on quiet mode for the whole workflow</span></span>
<span id="cb2-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_set_quiet</span>(<span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># new helper to set quietness across all functions</span></span>
<span id="cb2-5"></span>
<span id="cb2-6">result <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_interpret</span>(..., <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">quiet =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb2-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_notice</span>(..., <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">quiet =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb2-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_anticipate</span>(..., <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">quiet =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb2-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_structure</span>(..., <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">quiet =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb2-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_validate</span>(..., <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">quiet =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span></code></pre></div></div>
</details>
</div>
<p>Rather not sprinkle <code>quiet = TRUE</code> everywhere? Use <code>bid_with_quiet()</code> to temporarily suppress messages within a block of code, or set the global default with <code>bid_set_quiet()</code> and check it with <code>bid_get_quiet()</code>.</p>
<p>Now you’re in charge of the verbosity. Run noisy when debugging, then go quiet when integrating into a polished Shiny app or R Markdown report.</p>
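<p>If you are curious how a global default, a per-call override, and a temporary "quiet block" can coexist, the pattern can be sketched in base R with an option and <code>on.exit()</code>. This is an illustration under stated assumptions, not the <code>{bidux}</code> source: every <code>*_sketch</code> function and the <code>sketch.quiet</code> option name are hypothetical.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical sketch of an option-backed quiet switch; not {bidux} code
get_quiet_sketch &lt;- function() isTRUE(getOption("sketch.quiet", FALSE))

set_quiet_sketch &lt;- function(quiet = TRUE) {
  old &lt;- options(sketch.quiet = isTRUE(quiet))
  invisible(old)  # previous setting, so callers can restore it
}

with_quiet_sketch &lt;- function(code) {
  old &lt;- set_quiet_sketch(TRUE)
  on.exit(options(old), add = TRUE)  # restore the previous setting on exit
  force(code)  # `code` is evaluated lazily here, while quiet mode is on
}

# A stand-in for a stage function: the argument defaults to the global setting
stage_sketch &lt;- function(x, quiet = get_quiet_sketch()) {
  if (!quiet) message("Stage complete: ", x)
  invisible(x)
}

stage_sketch("interpret")                     # emits a message
with_quiet_sketch(stage_sketch("interpret"))  # no message
stage_sketch("interpret", quiet = TRUE)       # no message (per-call override)</code></pre></div>
<p>Because the wrapper restores the previous option value on exit, a quiet block cannot accidentally leave the rest of a session silenced.</p>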
<div class="float-end callout callout-style-default callout-note callout-titled" title="Shiny Developers: Leaner, Quieter Apps" width="50%">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Note</span>Shiny Developers: Leaner, Quieter Apps
</div>
</div>
<div class="callout-body-container callout-body">
<p>If you use <code>{bidux}</code> in Shiny, this release makes life easier. Removing heavy dependencies like <code>{purrr}</code> and <code>{stringr}</code> has slimmed down the package, which means lighter-weight deployments. And with the new quiet mode, you can keep your app logs clean: No more spamming the console with messages during user sessions. Focus on your users, not the chatter!</p>
</div>
</div>
</section>
<section id="new-s3-helpers-and-print-enhancements" class="level2">
<h2 class="anchored" data-anchor-id="new-s3-helpers-and-print-enhancements">New S3 Helpers and Print Enhancements</h2>
<p>To further modernize the API, <strong>v0.3.2</strong> introduces new S3 constructors for key BID components. You can now create user personas and bias mitigation sets with dedicated functions: <code>new_user_personas()</code> and <code>new_bias_mitigations()</code>. These expect a data frame with the appropriate columns (e.g.&nbsp;name, goals, and pain_points for personas) and return structured S3 objects.</p>
<p>The benefit? <strong>Type-safe, validated objects</strong> that play nicely with BID workflows. If you’ve been using raw lists for personas or biases, these helpers provide a clear migration path to a more robust format (and they’ll warn you if required fields are missing or misformatted).</p>
<p>Printing of BID objects also got a polish. For example, printing a <code>bid_data_story</code> now shows a tidy summary with each element on its own line, rather than a raw list dump. Likewise, <code>bid_user_personas</code> and <code>bid_bias_mitigations</code> objects print with a concise header (e.g.&nbsp;“bidux user personas (3 personas):”) followed by a tibble of the contents. This makes inspecting these objects in the console much more reader-friendly. Little quality-of-life improvements like this help you quickly verify that your inputs are set up correctly.</p>
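<p>For readers new to S3, the general shape of a validated constructor with a concise print header can be sketched in a few lines of base R. This is a hypothetical illustration, not the <code>{bidux}</code> implementation: the names <code>new_personas_sketch()</code> and <code>personas_sketch</code> are made up, the required columns are assumed from the example above, and a missing column is treated as an error here purely for simplicity.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical sketch of a validated S3 constructor; not {bidux} code
new_personas_sketch &lt;- function(x) {
  required &lt;- c("name", "goals", "pain_points")  # assumed column names
  if (!is.data.frame(x)) {
    stop("`x` must be a data frame.", call. = FALSE)
  }
  missing_cols &lt;- setdiff(required, names(x))
  if (length(missing_cols) &gt; 0) {
    stop(
      "Missing required column(s): ",
      paste(missing_cols, collapse = ", "),
      call. = FALSE
    )
  }
  structure(x, class = c("personas_sketch", class(x)))
}

# Print a one-line header, then fall through to the data frame method
print.personas_sketch &lt;- function(x, ...) {
  cat(sprintf("user personas (%d personas):\n", nrow(x)))
  NextMethod()
  invisible(x)
}

personas &lt;- new_personas_sketch(data.frame(
  name = c("Analyst", "Executive"),
  goals = c("Explore trends", "Check KPIs quickly"),
  pain_points = c("Too many filters", "Slow load times")
))
print(personas)</code></pre></div>
<p>Validating once, in the constructor, means downstream functions can rely on the class instead of re-checking columns at every step.</p>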
</section>
<section id="under-the-hood-leaner-and-cleaner" class="level2">
<h2 class="anchored" data-anchor-id="under-the-hood-leaner-and-cleaner">Under the Hood: Leaner and Cleaner</h2>
<p>This release also <strong>trims some fat and cleans up behind the scenes</strong> to make the package more efficient:</p>
<ul>
<li><strong>Reduced dependencies:</strong> We removed unused packages <code>{purrr}</code> and <code>{stringr}</code> from our imports, opting for base R solutions. This reduces installation size and the risk of version conflicts, so your projects load a bit faster and ship with less baggage.</li>
<li><strong>Better error messages:</strong> We’ve adopted the <code>glue</code> and <code>cli</code> packages for error handling. Now errors come with clearer context and suggestions (using <code>cli::cli_abort()</code> with our new <code>standard_error_msg()</code> helper). When something goes wrong, the feedback will be more informative and actionable than the old generic <code>stop()</code> messages.</li>
<li><strong>Text normalization:</strong> Various text-processing utilities in bidux have been refined to handle whitespace and special characters more gracefully. This means more reliable matching when generating suggestions or interpreting inputs, especially in edge cases.</li>
<li><strong>Code refactoring:</strong> We reorganized internal code into domain-specific modules (messaging, validation, stages, safe access). While invisible to you as a user, this modular structure makes bidux easier to maintain and test, which ultimately leads to more robust features.</li>
<li><strong>Telemetry constants:</strong> All those “magic numbers” for telemetry thresholds are now named constants at the top of the telemetry module. This cleanup makes the logic more transparent and sets the stage for easier tuning (for us and adventurous contributors).</li>
</ul>
<p>Crucially, <strong>none of these changes break existing functionality</strong>. They improve performance, clarity, or developer experience without altering how you call the functions. It’s all gain, no pain.</p>
</section>
<section id="documentation-and-resources" class="level2">
<h2 class="anchored" data-anchor-id="documentation-and-resources">Documentation and Resources</h2>
<p>To help you get the most out of these updates, we’ve expanded and refreshed the documentation:</p>
<ul>
<li><strong>New “API Modernization” vignette:</strong> A comprehensive guide for migrating from legacy list-based usage to the new S3 classes and flat APIs. This is a great starting point if you’re upgrading from an older version and want to adopt best practices (with plenty of examples for <code>data_story</code>, <code>user_personas</code>, and <code>bias_mitigations</code>).</li>
<li><strong>Simplified “Advanced Workflows” vignette:</strong> We’ve updated our advanced examples to showcase these new features in action. This includes how to use telemetry presets in different contexts and how to utilize quiet mode in complex pipelines. If you’re pushing bidux to its limits (e.g.&nbsp;analyzing dozens of dashboards in batch), check out this vignette for patterns and tips.</li>
<li><strong>Updated “Getting Started” vignette:</strong> New users will find the getting started guide now uses the flat <code>new_data_story()</code> and data frame-based personas from the outset. We’ve also ensured the examples reflect the current recommended usage (no outdated list formats) and highlighted the corrected BID stage ordering that was fixed in v0.3.1.</li>
<li><strong>Function reference examples:</strong> Key functions like <code>bid_interpret()</code> now include examples demonstrating both the modern usage and how to handle legacy inputs. Our goal is to guide users toward the new approaches while still acknowledging the old way for those in transition.</li>
</ul>
<p>As always, the reference documentation is available and has been updated with all new parameters (like <code>quiet</code>) and functions. And behind the scenes, we bolstered test coverage for all these changes, so you can upgrade with confidence.</p>
</section>
<section id="migration-and-backward-compatibility" class="level2">
<h2 class="anchored" data-anchor-id="migration-and-backward-compatibility">Migration and Backward Compatibility</h2>
<p>Upgrading to v0.3.2 should be seamless. All your code from v0.3.1 (and even v0.3.0) will continue to run, thanks to careful backward compatibility. If you do use any now-deprecated patterns, bidux will politely prod you with warnings and guidance:</p>
<ul>
<li><strong>Nested data_story format:</strong> If you pass <code>variables</code> or <code>relationships</code> into <code>new_data_story()</code>, you’ll get a deprecation warning. The function will still return a valid story object (so your pipeline won’t break), but consider this your nudge to switch to the flat syntax before 0.4.0.</li>
<li><strong>Legacy telemetry ingestion:</strong> The old <code>bid_ingest_telemetry()</code> function remains available for those who haven’t migrated to the new <code>bid_telemetry()</code> workflow. It continues to return the hybrid list-like object as before. However, we signaled in v0.3.1 that a shift toward <code>bid_telemetry()</code> is the future, and that remains true. Plan to update to the tidy tibble approach when convenient.</li>
<li><strong>List-based personas and mitigations:</strong> You can still supply user personas as plain lists of lists, and bias mitigations as raw lists. Internally, bidux will convert or migrate these to the new S3 objects on the fly. You might see messages suggesting that you use <code>new_user_personas()</code> or <code>new_bias_mitigations()</code> going forward, but your results will be the same. This gentle transition means you can adopt the new helpers at your own pace.</li>
</ul>
<p>In short, <strong>no breaking changes</strong> in v0.3.2. We’ve laid the groundwork for 0.4.0 deprecations but provided a smooth on-ramp.</p>
<p>If you run into any issues while upgrading, please check the <strong>API Modernization vignette</strong> or reach out on the GitHub repo.</p>
</section>
<section id="install-the-latest-release" class="level2">
<h2 class="anchored" data-anchor-id="install-the-latest-release">Install the Latest Release</h2>
<p>You can install <code>{bidux}</code> <strong>v0.3.2</strong> from CRAN today:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">install.packages</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bidux"</span>)</span></code></pre></div></div>
</details>
</div>
<p>As a minor release after v0.3.1, this version focuses on refining your experience: Making the API more intuitive, reducing friction from extraneous output, and keeping the package lean and efficient.</p>
<p>The goal is to let you spend less time wrangling the tool and more time applying <strong>Behavioral Insight Design</strong> to build better Shiny apps and dashboards. We hope these improvements make your BID workflows smoother and more enjoyable.</p>
<p>Happy designing! 🚀</p>


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{2025,
  author = {},
  title = {`\{Bidux\}` V0.3.2: {Leaner,} {Quieter,} {More} {Intuitive}
    {BID} {Workflows}},
  date = {2025-10-28},
  url = {https://www.jrwinget.com/blog/2025-10-29_leaner-quieter-intuitive/},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-2025" class="csl-entry quarto-appendix-citeas">
<span>“`{Bidux}` V0.3.2: Leaner, Quieter, More Intuitive BID
Workflows.”</span> 2025. October 28, 2025. <a href="https://www.jrwinget.com/blog/2025-10-29_leaner-quieter-intuitive/">https://www.jrwinget.com/blog/2025-10-29_leaner-quieter-intuitive/</a>.
</div></div></section></div> ]]></description>
  <category>Software Development</category>
  <category>User Experience</category>
  <guid>https://www.jrwinget.com/blog/2025-10-29_leaner-quieter-intuitive/</guid>
  <pubDate>Tue, 28 Oct 2025 00:00:00 GMT</pubDate>
  <media:content url="https://www.jrwinget.com/blog/2025-10-29_leaner-quieter-intuitive/featured.png" medium="image" type="image/png" height="167" width="144"/>
</item>
<item>
  <title>From First-Time Attendee to Speaker: My posit::conf Journey</title>
  <link>https://www.jrwinget.com/blog/2025-09-25_posit-conf-journey/</link>
  <description><![CDATA[ 




<section id="a-cold-chicago-winter-a-warm-san-diego-welcome" class="level2">
<h2 class="anchored" data-anchor-id="a-cold-chicago-winter-a-warm-san-diego-welcome">A Cold Chicago Winter, a Warm San Diego Welcome</h2>
<p>My first rstudio::conf was 2018 in San Diego, late January. Coming from Chicago, that felt like a small miracle. I had about six or seven months of R and the <code>{tidyverse}</code> behind me, was still studying social psychology in grad school, and had only ever done academic conferences. Walking into this one felt different: bigger stage, crisper production, and a sense that tooling and community might actually pull in the same direction.</p>
<p><img src="https://www.jrwinget.com/blog/2025-09-25_posit-conf-journey/img/signed-hadley.png" class="float-start rounded shadow-sm img-fluid" style="width:33.0%" alt="Photograph of a signed copy of the book R for Data Science by Hadley Wickham, obtained at rstudio::conf 2018 in San Diego"></p>
<p>I went with my closest friend from grad school, who’d already transitioned into data science. Everyone I met was working in ML, deep learning, or advanced quant at huge companies I’d only seen on slides. But these folks were also disarmingly kind. When I said I was a psychologist, often labeled a “soft science” in tech circles, no one waved me off. Instead, they just asked how I used R in my work. I talked about group decision making and computational modeling. People’s eyes widened. They leaned in. That surprised me, and it stuck.</p>
<p>The talks were practical and alive: Live demos, tools I could try immediately. I left able to build my first website with <code>{blogdown}</code> (which I <em>just</em> migrated from yesterday; this is my first post with the new Quarto setup). I also met Hadley and got my copy of <em>R for Data Science</em> signed. Total fanboy moment, noted and accepted.</p>
</section>
<section id="the-click-where-my-work-meets-my-values" class="level2">
<h2 class="anchored" data-anchor-id="the-click-where-my-work-meets-my-values">The Click: Where My Work Meets My Values</h2>
<p>My academic work looked at group decision making as a system of information processing. That lens made me care about how evidence actually moves through a pipeline: methods, statistics, and whether the work is verifiable by someone who is not me.</p>
<p>Very quickly, I realized three ideas were inseparable for me: open science, open source, and open access. Not as slogans, as habits. That looks like preregistration and reproducible code, but also public issue threads, code review in daylight, and packages stewarded by a community instead of a gatekeeper. R and Posit folks treated these not as extras but as table stakes. That alignment shaped my path toward building systems that lower friction and widen access for other people.</p>
<p><strong>What open science and open source share in practice (how I try to work):</strong></p>
<ul>
<li>Make claims traceable: Analysis scripts, data lineage, and assumptions live with the work</li>
<li>Prefer repair to perfection: Log issues, document tradeoffs, and improve in public</li>
<li>Share context, not just code: Explain what a function is for and how to use it responsibly</li>
<li>Design for stewardship: Make it easy for someone else to keep the lights on after you</li>
</ul>
</section>
<section id="why-i-keep-coming-back" class="level2">
<h2 class="anchored" data-anchor-id="why-i-keep-coming-back">Why I Keep Coming Back</h2>
<p>I keep returning because the technical depth is matched by something rarer: A culture that is welcoming without theater. You can engage on your own terms. I’ve given talks at other conferences, but at posit::conf, I mostly come to learn, listen, and connect. The constant has always been folks who treat curiosity as a shared resource.</p>
<p>I’ve read a lot of posts about people’s posit::conf experiences over the years, and they all match exactly what I’ve felt since 2018: belonging, room to participate at your own pace, serious inspiration, hands-on collaboration, and small moments of joy that matter more because of the people you share them with.</p>
<p>If you know, you know. And if you don’t, I hope you join us to find out.</p>
</section>
<section id="from-audience-to-speaker" class="level2">
<h2 class="anchored" data-anchor-id="from-audience-to-speaker">From Audience to Speaker</h2>
<p>Last week, I stepped onto the posit::conf stage for the first time. Just being accepted was an honor. Every year there are more excellent proposals than slots, so a “yes” signals the committee is trusting you to carry something useful onto that stage. And I was there to share something I care deeply about: the intersection of behavioral science and tool design.</p>
<p>I spoke about bringing these ideas to Shiny so teams can design apps that reduce cognitive friction instead of creating it, and about giving cognitive engineering as much attention as we give performance engineering. Think fewer “data dumps” with dozens of dropdowns and more guided flows that surface the right decision at the right moment. Better products for users, fewer support tickets for teams.</p>
<p><img src="https://www.jrwinget.com/blog/2025-09-25_posit-conf-journey/img/posit-stage.png" class="float-end rounded shadow-sm img-fluid" style="width:65.0%" alt="Speaker on stage at posit::conf 2025 presenting to an audience, speaking about behavioral science and tool design for improving user experience in software development"></p>
<p>I was anxious the whole week leading up to my talk, which isn’t unusual since I often feel that way before presenting. I’ve spoken at other conferences, led workshops, and taught college courses, so public speaking is familiar territory. But this felt different. The moment I walked on stage and brought up my slides, the anxiety disappeared! I had <em>never</em> experienced that before. It felt like a conversation with friends who care about good tools and solid evidence. The examples landed, the audience leaned in, and the Q&amp;A could have gone on much longer if we hadn’t run out of time.</p>
<p>It was easily the most fun I’ve ever had giving a talk; truly an experience I’ll carry with me for a long time. The best part might have been the hallway follow-ups: quick notebook sketches, a few “this changes how I think about building dashboards” epiphanies, and two separate chats about adapting the ideas for decision-making research.</p>
<p>I’m so grateful to the Posit team for creating conditions where growth is possible, and to this polyglot community for making it feel like home. The spirit that met me in 2018 is still doing profound, durable work.</p>
</section>
<section id="a-day-contributing-tidyverse-developer-day" class="level2">
<h2 class="anchored" data-anchor-id="a-day-contributing-tidyverse-developer-day">A Day Contributing: Tidyverse Developer Day</h2>
<p>After the main program, I spent most of Friday at Tidyverse Developer Day: A room full of people contributing to <code>{tidyverse}</code> packages with the authors and maintainers working alongside them. The team curates issues for all skill levels, with helpers floating around to unblock anyone who gets stuck.</p>
<div class="float-end callout callout-style-default callout-tip callout-titled" title="Tidyverse Developer Day (TDD) resources:" width="50%">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Tip</span>Tidyverse Developer Day (TDD) resources:
</div>
</div>
<div class="callout-body-container callout-body">
<ul>
<li><a href="https://www.tidyverse.org/blog/2025/07/tdd-2025/">TDD 2025 info</a></li>
<li><a href="https://github.com/tidyverse/tidy-dev-day">TDD repo</a></li>
<li><a href="https://github.com/tidyverse/tidy-dev-day/blob/main/CODE_OF_CONDUCT.md">Code of Conduct</a></li>
</ul>
</div>
</div>
<p>It’s a welcoming space for a topic that can feel high-friction the first time you try it. The bar to contribute drops incredibly fast when someone says, “let’s open the issue and look together”. This year, Joe Cheng was at my table, a genuine delight for this avid Shiny user.</p>
<p>I <strong>strongly suspect</strong> days like this compound: first PRs become second and third ones, kind reviews early on create the next round of kind reviewers, and the effect spills into broader open source ecosystems.</p>
</section>
<section id="what-i-took-home-this-time" class="level2">
<h2 class="anchored" data-anchor-id="what-i-took-home-this-time">What I Took Home This Time</h2>
<ul>
<li>A <strong>clearer north star</strong> for my work: Build systems that help reduce friction, support better choices, and promote equitable collaboration so the next person/team moves faster.</li>
<li>A reminder that <strong>precision and kindness are not in tension</strong>: You can ask for evidence and still be generous. They are not mutually exclusive but mutually necessary for progress.</li>
<li>A short list of <strong>what’s next</strong>: Deepen Rust knowledge, keep improving <code>{bidux}</code>, continue bridging behavioral science and tech, and always strive to meet people where they are.</li>
</ul>
</section>
<section id="looking-ahead" class="level2">
<h2 class="anchored" data-anchor-id="looking-ahead">Looking Ahead</h2>
<p>Seven years later, I still leave recharged and focused. My plan is simple: Pay forward the clarity and encouragement this community gave me. Build things that make it easier for the next person to do good work.</p>
<p>If that sounds like your kind of place, I look forward to seeing you there!</p>


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{2025,
  author = {},
  title = {From {First-Time} {Attendee} to {Speaker:} {My} Posit::conf
    {Journey}},
  date = {2025-09-25},
  url = {https://www.jrwinget.com/blog/2025-09-25_posit-conf-journey/},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-2025" class="csl-entry quarto-appendix-citeas">
<span>“From First-Time Attendee to Speaker: My Posit::conf
Journey.”</span> 2025. September 25, 2025. <a href="https://www.jrwinget.com/blog/2025-09-25_posit-conf-journey/">https://www.jrwinget.com/blog/2025-09-25_posit-conf-journey/</a>.
</div></div></section></div> ]]></description>
  <category>Education &amp; Community</category>
  <category>Behavioral Science</category>
  <guid>https://www.jrwinget.com/blog/2025-09-25_posit-conf-journey/</guid>
  <pubDate>Thu, 25 Sep 2025 00:00:00 GMT</pubDate>
  <media:content url="https://www.jrwinget.com/blog/2025-09-25_posit-conf-journey/featured.png" medium="image" type="image/png" height="73" width="144"/>
</item>
<item>
  <title>{bidux} v0.3.1: Modern Telemetry Integration in the BID Framework</title>
  <link>https://www.jrwinget.com/blog/2025-09-08_modern-telemetry/</link>
  <description><![CDATA[ 




<p>I’m excited to announce that <strong>bidux v0.3.1</strong> is now available on CRAN. This release brings telemetry deeper into the Behavioral Insight Design (BID) framework, strengthens stage consistency, and smooths the migration path for users moving from earlier versions.</p>
<section id="a-modern-telemetry-workflow" class="level2">
<h2 class="anchored" data-anchor-id="a-modern-telemetry-workflow">A Modern Telemetry Workflow</h2>
<p>Telemetry data is increasingly central to BID workflows. Until now, telemetry ingestion produced list-like objects that worked but weren’t always convenient for analysis.</p>
<p>In <strong>v0.3.1</strong>, <code>bid_ingest_telemetry()</code> now returns <strong>hybrid telemetry objects</strong>: they behave like lists for backward compatibility, while adding new methods such as <code>as_tibble()</code> and <code>bid_flags()</code>.</p>
<p>For new projects, the preferred interface is the tidy-friendly <code>bid_telemetry()</code>, which returns structured tibbles (<code>bid_issues_tbl</code>) ready for <code>dplyr</code> pipelines:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1">issues <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_telemetry</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"telemetry.sqlite"</span>)</span>
<span id="cb1-2">critical <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> issues <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(severity <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"critical"</span>)</span></code></pre></div></div>
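<p>The "hybrid object" idea described above (a list that keeps working as a list but also converts to a tidy table) can be sketched in base R. This is a hypothetical illustration rather than the <code>{bidux}</code> implementation: the class name <code>hybrid_sketch</code>, the issue names, and the fields are made up, and base R's <code>as.data.frame()</code> stands in for the <code>as_tibble()</code> method mentioned in the text.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical sketch of a "hybrid" list object; not {bidux} code
new_hybrid_sketch &lt;- function(issues) {
  structure(issues, class = c("hybrid_sketch", "list"))
}

# Conversion method: one row per issue
as.data.frame.hybrid_sketch &lt;- function(x, ...) {
  issues &lt;- unclass(x)
  data.frame(
    id = names(issues),
    severity = unname(vapply(issues, function(i) i$severity, character(1))),
    stringsAsFactors = FALSE
  )
}

telemetry &lt;- new_hybrid_sketch(list(
  unused_input = list(severity = "critical"),
  slow_tab     = list(severity = "minor")
))

telemetry$unused_input$severity  # list-style access still works

issues_df &lt;- as.data.frame(telemetry)
issues_df[issues_df$severity == "critical", ]  # table-style filtering</code></pre></div>
<p>Adding a class on top of a plain list is what allows old list-based code and new table-based code to share the same object during a migration.</p>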
</section>
<section id="from-telemetry-to-bid-stages" class="level2">
<h2 class="anchored" data-anchor-id="from-telemetry-to-bid-stages">From Telemetry to BID Stages</h2>
<p>This release also introduces <strong>bridge functions</strong> that connect telemetry issues directly to BID stages:</p>
<ul>
<li><code>bid_notice_issue()</code>: convert one issue into a Notice stage</li>
<li><code>bid_notices()</code>: batch process multiple issues</li>
<li><code>bid_address()</code>: quickly act on a set of issues</li>
<li><code>bid_pipeline()</code>: process the first <em>N</em> issues with limits</li>
</ul>
<p>These functions make it easier to move from raw telemetry data to structured BID insights without extra glue code.</p>
</section>
<section id="a-quick-example" class="level2">
<h2 class="anchored" data-anchor-id="a-quick-example">A Quick Example</h2>
<p>Here’s what a telemetry-informed BID pipeline can look like in <strong>v0.3.1</strong>:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(bidux)</span>
<span id="cb2-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(dplyr)</span>
<span id="cb2-3"></span>
<span id="cb2-4"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Ingest telemetry as tidy issues</span></span>
<span id="cb2-5">issues <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_telemetry</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"telemetry.sqlite"</span>)</span>
<span id="cb2-6"></span>
<span id="cb2-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Focus on the most severe issues</span></span>
<span id="cb2-8">critical <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> issues <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb2-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(severity <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"critical"</span>)</span>
<span id="cb2-10"></span>
<span id="cb2-11"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Turn those into BID notices</span></span>
<span id="cb2-12">notices <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_notices</span>(critical)</span>
<span id="cb2-13"></span>
<span id="cb2-14"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Move forward through the pipeline</span></span>
<span id="cb2-15">structure <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_structure</span>(notices, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">telemetry_flags =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_flags</span>(critical))</span>
<span id="cb2-16">validate  <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_validate</span>(structure, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">telemetry_refs =</span> critical)</span></code></pre></div></div>
</details>
</div>
<p>This pipeline takes real user behavior (via telemetry), surfaces critical issues, and carries them through to <strong>structured design recommendations</strong> and <strong>validation</strong>, all with tidy, pipeline-friendly functions.</p>
</section>
<section id="framework-stability-and-improvements" class="level2">
<h2 class="anchored" data-anchor-id="framework-stability-and-improvements">Framework Stability and Improvements</h2>
<p>Alongside telemetry, this release improves framework stability and consistency:</p>
<ul>
<li><strong>Stage order correction</strong>:
<ul>
<li>BID stages now follow the canonical sequence: <em>Interpret → Notice → Anticipate → Structure → Validate</em></li>
<li>Migration support is built in, so older outputs remain interpretable</li>
</ul></li>
</ul>
<p><img src="https://www.jrwinget.com/blog/2025-09-08_modern-telemetry/img/bid-framework.png" class="img-fluid" alt="Behavioral Insight Design (BID) framework diagram showing five sequential stages: Interpret User Needs, Notice the Problem, Anticipate User Behavior, Structure the App, and Validate and Empower the User"></p>
<ul>
<li><strong>Unified suggestion system</strong>:
<ul>
<li>Interpret, Notice, and Validate now share consistent rules, reducing duplication and simplifying outputs</li>
</ul></li>
<li><strong>Structure and validation refinements</strong>:
<ul>
<li><code>bid_structure()</code> can adjust recommendations using telemetry flags (e.g., avoid tabs if navigation drop-offs are detected)</li>
<li><code>bid_validate()</code> can link validation results back to the telemetry issues that motivated them</li>
</ul></li>
</ul>
</section>
<section id="migration-and-deprecations" class="level2">
<h2 class="anchored" data-anchor-id="migration-and-deprecations">Migration and Deprecations</h2>
<p>Existing telemetry code continues to work and now has access to the enhanced features. Looking ahead to <strong>0.4.0</strong>:</p>
<ul>
<li><code>bid_ingest_telemetry()</code> will be soft-deprecated in favor of <code>bid_telemetry()</code>.</li>
<li>Layout auto-selection in <code>bid_structure()</code> and layout-specific bias mitigations in <code>bid_anticipate()</code> will be removed in favor of concept-driven approaches.</li>
</ul>
</section>
<section id="documentation-and-tests" class="level2">
<h2 class="anchored" data-anchor-id="documentation-and-tests">Documentation and Tests</h2>
<p>We’ve updated the <strong>Getting Started</strong> vignette with new telemetry examples and corrected stage numbering, and expanded the <strong>Telemetry Integration</strong> vignette to show the new tidy workflow. Test coverage has been broadened across new methods, bridge functions, and migration utilities.</p>
</section>
<section id="install-the-latest-release" class="level2">
<h2 class="anchored" data-anchor-id="install-the-latest-release">Install the Latest Release</h2>
<p>You can install <strong>bidux v0.3.1</strong> from CRAN today:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">install.packages</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bidux"</span>)</span></code></pre></div></div>
</details>
</div>
<p>This release makes working with telemetry in BID smoother, tidier, and more consistent. It strengthens the framework while keeping migration straightforward, so you can spend less time adapting code and more time focusing on insights.</p>


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{2025,
  author = {},
  title = {`\{Bidux\}` V0.3.1: {Modern} {Telemetry} {Integration} in the
    {BID} {Framework}},
  date = {2025-09-08},
  url = {https://www.jrwinget.com/blog/2025-09-08_modern-telemetry/},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-2025" class="csl-entry quarto-appendix-citeas">
<span>“`{Bidux}` V0.3.1: Modern Telemetry Integration in the BID
Framework.”</span> 2025. September 8, 2025. <a href="https://www.jrwinget.com/blog/2025-09-08_modern-telemetry/">https://www.jrwinget.com/blog/2025-09-08_modern-telemetry/</a>.
</div></div></section></div> ]]></description>
  <category>Software Development</category>
  <category>User Experience</category>
  <guid>https://www.jrwinget.com/blog/2025-09-08_modern-telemetry/</guid>
  <pubDate>Mon, 08 Sep 2025 00:00:00 GMT</pubDate>
  <media:content url="https://www.jrwinget.com/blog/2025-09-08_modern-telemetry/featured.png" medium="image" type="image/png" height="167" width="144"/>
</item>
<item>
  <title>From Friction to Flow: Designing Smarter Dashboards with {bidux}</title>
  <link>https://www.jrwinget.com/blog/2025-06-19_from-friction-to-flow/</link>
  <description><![CDATA[ 




<blockquote class="blockquote">
<p>🚀 <code>{bidux}</code> v0.1.0 is live on CRAN!<br> Users don’t just see your dashboard: They interpret it, navigate it, and act (or fail to act) based on it. That’s not just design; it’s cognition.<br> Learn more below or explore the repo: <a href="https://github.com/jrwinget/bidux">github.com/jrwinget/bidux</a></p>
</blockquote>
<section id="why-ux-is-too-often-an-afterthought" class="level2">
<h2 class="anchored" data-anchor-id="why-ux-is-too-often-an-afterthought">Why UX Is Too Often an Afterthought</h2>
<p>In data analytics and dashboard development, we often get the logic right: The data is clean, the calculations are correct, and the visualizations are technically sound. But when the dashboard goes live, something breaks.</p>
<p>Users ignore insights. Or worse, they disengage entirely.</p>
<p>This isn’t a reflection of poor technical work. It’s a sign that something deeper is missing: A bridge between how we design interfaces and how people actually think.</p>
<p>Most developers aren’t trained in psychology or UX. That’s not a failing; it’s a gap in the pipeline. Users interpret our tools through human lenses: They carry cognitive limitations, rely on mental shortcuts, and make judgments shaped by emotion, context, and bias.</p>
<p>To design dashboards that not only work, but <em>resonate</em>, we need to design for the mind, not just the machine.</p>
</section>
<section id="a-new-starting-point-bid-bidux" class="level2">
<h2 class="anchored" data-anchor-id="a-new-starting-point-bid-bidux">A New Starting Point: BID + <code>{bidux}</code></h2>
<p>The <strong>Behavioral Insight Design (BID)</strong> framework offers a structured, evidence-based approach to building more intuitive, cognitively supportive dashboards. Developed at the intersection of behavioral science, data storytelling, and interface design, BID maps out five stages that reflect how users actually engage with information:</p>
<p><img src="https://www.jrwinget.com/blog/2025-06-19_from-friction-to-flow/img/bid-framework.png" class="img-fluid" alt="Behavioral Insight Design (BID) framework diagram showing five sequential stages: Interpret User Needs, Notice the Problem, Anticipate User Behavior, Structure the App, and Validate and Empower the User"></p>
<p>Each stage helps developers reduce friction, surface insight, and support better decision-making, even without a background in psychology or UX.</p>
<p>The <code>{bidux}</code> R package makes this framework practical for developers. It provides a step-by-step workflow, concept dictionaries, and component suggestions that bring behavioral design directly into your Shiny development process.</p>
<p>Together, BID and <code>{bidux}</code> help you turn psychological friction into flow.</p>
</section>
<section id="what-makes-bid-different" class="level2">
<h2 class="anchored" data-anchor-id="what-makes-bid-different">What Makes BID Different?</h2>
<p>Most design systems focus on aesthetics or layout heuristics. BID starts earlier by asking what your user <em>needs to think, feel, and decide</em> at each stage of interaction.</p>
<p>BID is grounded in cognitive psychology, decision science, and information processing theory. It doesn’t just tell you what to build; it explains <em>why</em> certain design choices succeed or fail, based on decades of empirical research.</p>
<p>And while other frameworks often isolate usability issues or user behavior, BID treats these mechanisms as dynamically linked. For instance:</p>
<ul>
<li>How <strong>cognitive load</strong> affects susceptibility to bias</li>
<li>How early layout decisions shape later interpretations</li>
<li>How interface design impacts individual and group coordination</li>
</ul>
<p><code>{bidux}</code> brings this theory into practice, giving you the tools to identify friction points, document key decisions, and structure your dashboard with the user’s cognition in mind.</p>
</section>
<section id="the-five-bid-stages-with-bidux-examples" class="level2">
<h2 class="anchored" data-anchor-id="the-five-bid-stages-with-bidux-examples">The Five BID Stages (with <code>{bidux}</code> Examples)</h2>
<section id="interpret-the-users-needs" class="level3">
<h3 class="anchored" data-anchor-id="interpret-the-users-needs">1️⃣ <strong>Interpret the User’s Needs</strong></h3>
<p><strong>Goal</strong>: Center your design around the core questions users are trying to answer.</p>
<blockquote class="blockquote">
<p>Users don’t want every chart: They want clarity about what matters and, often more importantly, what to do about it.</p>
</blockquote>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1">stg_interpret <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_interpret</span>(</span>
<span id="cb1-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">central_question =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Where are sales underperforming?"</span>,</span>
<span id="cb1-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data_story =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">hook =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"..."</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">tension =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"..."</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">resolution =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"..."</span>)</span>
<span id="cb1-4">)</span>
<span id="cb1-5"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Warning: ! Using deprecated list format for data_story parameter</span></span>
<span id="cb1-6"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## ℹ Please use new_data_story() constructor for new code</span></span>
<span id="cb1-7"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## ℹ Legacy format will be automatically migrated</span></span>
<span id="cb1-8"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Warning: ! Using deprecated nested format for data_story</span></span>
<span id="cb1-9"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## ℹ The flat API is now recommended: new_data_story(hook, context, tension,</span></span>
<span id="cb1-10"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   resolution)</span></span>
<span id="cb1-11"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## ℹ Nested format (variables, relationships) will be removed in bidux 0.4.0</span></span>
<span id="cb1-12"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Stage 1 (Interpret) completed.</span></span>
<span id="cb1-13"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Central question: Where are sales underperforming?</span></span>
<span id="cb1-14"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Your data story is incomplete (25%). Consider adding these missing elements: hook, tension, resolution.</span></span>
<span id="cb1-15"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Your central question is appropriately scoped.</span></span>
<span id="cb1-16"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - No user personas defined</span></span></code></pre></div></div>
</details>
</div>
<p><strong>Try</strong>: Structuring the app like a narrative; defining user personas to anchor your story.</p>
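<p>The deprecation warnings in the output above recommend the flat <code>new_data_story(hook, context, tension, resolution)</code> constructor over a plain list. A minimal sketch of that form follows, assuming an installed <code>{bidux}</code> version that exports <code>new_data_story()</code>; the story text is purely illustrative and not taken from a real project.</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r">library(bidux)

# Illustrative story values (hypothetical); argument names follow the
# signature printed in the deprecation warning
story &lt;- new_data_story(
  hook = "Regional sales dipped last quarter",
  context = "Sales leads review this dashboard weekly",
  tension = "It is unclear which regions drive the decline",
  resolution = "Highlight underperforming regions first"
)

stg_interpret &lt;- bid_interpret(
  central_question = "Where are sales underperforming?",
  data_story = story
)</code></pre>
</div>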
</section>
<section id="notice-the-problem" class="level3">
<h3 class="anchored" data-anchor-id="notice-the-problem">2️⃣ <strong>Notice the Problem</strong></h3>
<p><strong>Goal</strong>: Identify where users struggle cognitively, visually, or emotionally.</p>
<blockquote class="blockquote">
<p>Most friction stems from overload: Too many filters, unclear hierarchies, or competing focal points.</p>
</blockquote>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># install.packages("bidux")</span></span>
<span id="cb2-2"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># library(bidux)</span></span>
<span id="cb2-3"></span>
<span id="cb2-4">stg_notice <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_notice</span>(</span>
<span id="cb2-5">  stg_interpret,</span>
<span id="cb2-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">problem =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Users can't find the key metrics"</span>,</span>
<span id="cb2-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">evidence =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"70% of testers took &gt;30s locating the primary KPI"</span></span>
<span id="cb2-8">)</span>
<span id="cb2-9"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Auto-suggested theory: Cognitive Load Theory (confidence: 90%)</span></span>
<span id="cb2-10"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Stage 2 (Notice) completed. (40% complete)</span></span>
<span id="cb2-11"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Problem: Users can't find the key metrics</span></span>
<span id="cb2-12"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Theory: Cognitive Load Theory (auto-suggested)</span></span>
<span id="cb2-13"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Evidence: 70% of testers took &gt;30s locating the primary KPI</span></span>
<span id="cb2-14"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Theory confidence: 90%</span></span>
<span id="cb2-15"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Next: Use bid_anticipate() for Stage 3</span></span></code></pre></div></div>
</details>
</div>
<p><strong>Try</strong>: Swapping dropdowns for grouped radio buttons; surfacing KPIs higher in the layout.</p>
</section>
<section id="anticipate-user-behavior" class="level3">
<h3 class="anchored" data-anchor-id="anticipate-user-behavior">3️⃣ <strong>Anticipate User Behavior</strong></h3>
<p><strong>Goal</strong>: Account for predictable biases in how users interpret and interact with data.</p>
<blockquote class="blockquote">
<p>People often anchor on the first number they see, seek out confirming evidence, and interpret identical outcomes differently depending on how they’re framed.</p>
</blockquote>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1">stg_anticipate <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_anticipate</span>(stg_notice)</span>
<span id="cb3-2"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Stage 3 (Anticipate) completed.</span></span>
<span id="cb3-3"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Bias mitigations: 4 defined</span></span>
<span id="cb3-4"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Accessibility considerations included</span></span>
<span id="cb3-5"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Key suggestions: Anchoring mitigation: Always show reference points like previous period, budget, or industry average, Framing mitigation: Toggle between progress (65% complete) and gap (35% remaining) framing, Confirmation Bias mitigation: Include alternative views that might challenge the main narrative</span></span></code></pre></div></div>
</details>
</div>
<p><strong>Try</strong>: Showing both “65% complete” and “35% remaining”; providing scenario toggles to challenge assumptions.</p>
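<p>One way to offer both framings is to compute them from the same underlying value, so the dashboard can show either or toggle between them. The sketch below uses base R only; <code>frame_progress()</code> is a hypothetical helper for illustration, not part of <code>{bidux}</code>.</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r"># Hypothetical helper: express the same progress as "complete" and "remaining"
frame_progress &lt;- function(done, total) {
  pct_done &lt;- round(100 * done / total)
  c(
    progress = sprintf("%.0f%% complete", pct_done),
    gap = sprintf("%.0f%% remaining", 100 - pct_done)
  )
}

framings &lt;- frame_progress(65, 100)
framings[["progress"]]
framings[["gap"]]</code></pre>
</div>
<p>For 65 of 100 units, the two labels are “65% complete” and “35% remaining”, so a UI toggle only needs to switch which element it displays.</p>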
</section>
<section id="structure-the-app" class="level3">
<h3 class="anchored" data-anchor-id="structure-the-app">4️⃣ <strong>Structure the App</strong></h3>
<p><strong>Goal</strong>: Organize layout and flow to reduce decision errors and guide attention.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1">stg_structure <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_structure</span>(stg_anticipate)</span>
<span id="cb4-2"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Warning: Layout auto-selection is deprecated and will be removed in bidux</span></span>
<span id="cb4-3"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 0.4.0. The BID framework will focus on concept-based suggestions instead.</span></span>
<span id="cb4-4"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Existing code will continue to work until 0.4.0.</span></span>
<span id="cb4-5"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Stage 4 (Structure) completed.</span></span>
<span id="cb4-6"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Auto-selected layout: breathable</span></span>
<span id="cb4-7"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Concept groups generated: 3</span></span>
<span id="cb4-8"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Total concepts: 3</span></span></code></pre></div></div>
</details>
</div>
<p><strong>Try</strong>: Grouping filters near relevant charts; using <code>bslib::card_body(padding = 4)</code> for visual spacing.</p>
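<p>As a rough sketch of that suggestion, the card below keeps a filter directly above the chart it controls and applies <code>padding = 4</code> to the card body. It assumes <code>{shiny}</code> and <code>{bslib}</code> are installed; the input and output IDs (<code>"region"</code>, <code>"sales_plot"</code>) and the region choices are hypothetical placeholders.</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r">library(shiny)
library(bslib)

# Hypothetical card: the filter lives in the same card as the chart it affects
sales_card &lt;- card(
  card_header("Sales by region"),
  card_body(
    padding = 4,
    selectInput(
      "region", "Region",
      choices = c("North", "South", "East", "West")
    ),
    plotOutput("sales_plot")
  )
)

# UI only; the server logic that renders "sales_plot" is omitted here
ui &lt;- page_fillable(sales_card)</code></pre>
</div>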
</section>
<section id="validate-empower-the-user" class="level3">
<h3 class="anchored" data-anchor-id="validate-empower-the-user">5️⃣ <strong>Validate &amp; Empower the User</strong></h3>
<p><strong>Goal</strong>: Reinforce clarity, support confidence, and enable action (individually or as a team).</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb5-1">stg_validate <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_validate</span>(stg_structure)</span>
<span id="cb5-2"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Stage 5 (Validate) completed.</span></span>
<span id="cb5-3"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Summary panel: Dashboard provides clear summary of key insight...</span></span>
<span id="cb5-4"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Collaboration: Enable team sharing and collaborative decision-...</span></span>
<span id="cb5-5"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Next steps: 11 items defined</span></span>
<span id="cb5-6"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Consider adding user empowerment tools to enhance collaboration</span></span></code></pre></div></div>
</details>
</div>
<p><strong>Try</strong>: Ending with a clear takeaway panel; adding next-step checklists or shareable reports.</p>
</section>
</section>
<section id="try-it-out" class="level2">
<h2 class="anchored" data-anchor-id="try-it-out">Try It Out</h2>
<p>Example flow:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(bidux)</span>
<span id="cb6-2"></span>
<span id="cb6-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Align the dashboard with user goals</span></span>
<span id="cb6-4">workflow <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_interpret</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">central_question =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Which products underperformed?"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb6-5"></span>
<span id="cb6-6">  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Identify user friction points</span></span>
<span id="cb6-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_notice</span>(</span>
<span id="cb6-8">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">problem =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Too many filters"</span>,</span>
<span id="cb6-9">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">evidence =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Users forget the currently selected options"</span></span>
<span id="cb6-10">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb6-11"></span>
<span id="cb6-12">  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Address predictable cognitive biases (leave blank for auto-suggestions)</span></span>
<span id="cb6-13">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_anticipate</span>(</span>
<span id="cb6-14">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">bias_mitigations =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(</span>
<span id="cb6-15">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">anchoring =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Include prior year as reference"</span>,</span>
<span id="cb6-16">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">confirmation_bias =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Show competing explanations"</span></span>
<span id="cb6-17">    )</span>
<span id="cb6-18">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb6-19"></span>
<span id="cb6-20">  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Design information structure</span></span>
<span id="cb6-21">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_structure</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb6-22"></span>
<span id="cb6-23">  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Reinforce clarity and enable collaboration</span></span>
<span id="cb6-24">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_validate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">summary_panel =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Key takeaways with export"</span>)</span>
<span id="cb6-25"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Warning: ! Using deprecated nested format for data_story</span></span>
<span id="cb6-26"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## ℹ The flat API is now recommended: new_data_story(hook, context, tension,</span></span>
<span id="cb6-27"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   resolution)</span></span>
<span id="cb6-28"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## ℹ Nested format (variables, relationships) will be removed in bidux 0.4.0</span></span>
<span id="cb6-29"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Stage 1 (Interpret) completed.</span></span>
<span id="cb6-30"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Central question: Which products underperformed?</span></span>
<span id="cb6-31"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Your data story is incomplete (25%). Consider adding these missing elements: hook, tension, resolution.</span></span>
<span id="cb6-32"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Your central question is appropriately scoped.</span></span>
<span id="cb6-33"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - No user personas defined </span></span>
<span id="cb6-34"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Auto-suggested theory: Hick's Law (confidence: 90%)</span></span>
<span id="cb6-35"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Stage 2 (Notice) completed. (40% complete)</span></span>
<span id="cb6-36"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Problem: Too many filters</span></span>
<span id="cb6-37"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Theory: Hick's Law (auto-suggested)</span></span>
<span id="cb6-38"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Evidence: Users forget the currently selected options</span></span>
<span id="cb6-39"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Theory confidence: 90%</span></span>
<span id="cb6-40"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Next: Use bid_anticipate() for Stage 3</span></span>
<span id="cb6-41"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Warning: ! Using deprecated list format for bias_mitigations parameter</span></span>
<span id="cb6-42"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## ℹ Please use new_bias_mitigations() constructor for new code</span></span>
<span id="cb6-43"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## ℹ Legacy format will be automatically migrated</span></span>
<span id="cb6-44"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Stage 3 (Anticipate) completed.</span></span>
<span id="cb6-45"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Bias mitigations: 4 defined</span></span>
<span id="cb6-46"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Accessibility considerations included</span></span>
<span id="cb6-47"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Key suggestions: Bias_type mitigation: Consider how this bias affects user decisions, Mitigation_strategy mitigation: Consider how this bias affects user decisions, Confidence_level mitigation: Consider how this bias affects user decisions</span></span>
<span id="cb6-48"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Warning: Layout auto-selection is deprecated and will be removed in bidux</span></span>
<span id="cb6-49"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 0.4.0. The BID framework will focus on concept-based suggestions instead.</span></span>
<span id="cb6-50"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Existing code will continue to work until 0.4.0.</span></span>
<span id="cb6-51"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Stage 4 (Structure) completed.</span></span>
<span id="cb6-52"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Auto-selected layout: breathable</span></span>
<span id="cb6-53"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Concept groups generated: 3</span></span>
<span id="cb6-54"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Total concepts: 3</span></span>
<span id="cb6-55"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Stage 5 (Validate) completed.</span></span>
<span id="cb6-56"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Summary panel: Key takeaways with export</span></span>
<span id="cb6-57"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Collaboration: Enable team sharing and collaborative decision-...</span></span>
<span id="cb6-58"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Next steps: 11 items defined</span></span>
<span id="cb6-59"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   - Ensure summary panel includes actionable insights Consider adding user empowerment tools to enhance collaboration</span></span></code></pre></div></div>
</details>
</div>
<p>Then explore suggestions:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bid_suggest_components</span>(workflow, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bslib"</span>)</span>
<span id="cb7-2"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # A tibble: 8 × 7</span></span>
<span id="cb7-3"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   package component description bid_stage_relevance cognitive_concepts use_cases</span></span>
<span id="cb7-4"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   &lt;chr&gt;   &lt;chr&gt;     &lt;chr&gt;       &lt;chr&gt;               &lt;chr&gt;              &lt;chr&gt;    </span></span>
<span id="cb7-5"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 1 bslib   nav_panel Create tab… Stage 1,Stage 3     Cognitive Load Th… content …</span></span>
<span id="cb7-6"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 2 bslib   accordion Implement … Stage 1,Stage 3     Progressive Discl… FAQ sect…</span></span>
<span id="cb7-7"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 3 bslib   value_box Display ke… Stage 2,Stage 3,St… Visual Hierarchy,… KPI disp…</span></span>
<span id="cb7-8"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 4 bslib   card      Organize c… Stage 3,Stage 5     Principle of Prox… Content …</span></span>
<span id="cb7-9"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 5 bslib   layout_c… Create res… Stage 3             Cognitive Load Th… responsi…</span></span>
<span id="cb7-10"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 6 bslib   layout_c… Create fle… Stage 3             Visual Hierarchy,… precise …</span></span>
<span id="cb7-11"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 7 bslib   card_hea… Add header… Stage 3             Information Hiera… section …</span></span>
<span id="cb7-12"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 8 bslib   card_body Control ca… Stage 3             Breathable Layout… content …</span></span>
<span id="cb7-13"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # ℹ 1 more variable: relevance &lt;dbl&gt;</span></span></code></pre></div></div>
</details>
</div>
</section>
<section id="learn-more" class="level2">
<h2 class="anchored" data-anchor-id="learn-more">Learn More</h2>
<ul>
<li>📦 GitHub: <a href="https://github.com/jrwinget/bidux">jrwinget/bidux</a></li>
<li>📚 Vignettes: <code>introduction-to-bid</code>, <code>concepts-reference</code>, <code>getting-started</code></li>
<li>👀 Theory paper coming soon! <a href="https://github.com/jrwinget/bid-framework">Watch this space</a></li>
</ul>
</section>
<section id="final-thoughts" class="level2">
<h2 class="anchored" data-anchor-id="final-thoughts">Final Thoughts</h2>
<p>Better dashboards aren’t just more attractive; they’re more <em>intelligible</em>. They reduce unnecessary decisions, anticipate user confusion, and surface what matters.</p>
<p>The BID framework gives developers a behavioral lens. <code>{bidux}</code> gives them the tools to act on it.</p>
<p>You don’t need a psychology degree to design with cognitive empathy, just the right questions and the right support. <code>{bidux}</code> helps you with both.</p>
<p>Happy designing! 🚀</p>


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{2025,
  author = {},
  title = {From {Friction} to {Flow:} {Designing} {Smarter} {Dashboards}
    with `\{Bidux\}`},
  date = {2025-06-19},
  url = {https://www.jrwinget.com/blog/2025-06-19_from-friction-to-flow/},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-2025" class="csl-entry quarto-appendix-citeas">
<span>“From Friction to Flow: Designing Smarter Dashboards with
`{Bidux}`.”</span> 2025. June 19, 2025. <a href="https://www.jrwinget.com/blog/2025-06-19_from-friction-to-flow/">https://www.jrwinget.com/blog/2025-06-19_from-friction-to-flow/</a>.
</div></div></section></div> ]]></description>
  <category>Software Development</category>
  <category>User Experience</category>
  <guid>https://www.jrwinget.com/blog/2025-06-19_from-friction-to-flow/</guid>
  <pubDate>Thu, 19 Jun 2025 00:00:00 GMT</pubDate>
  <media:content url="https://www.jrwinget.com/blog/2025-06-19_from-friction-to-flow/featured.png" medium="image" type="image/png" height="167" width="144"/>
</item>
<item>
  <title>Predicting pileups: Using ML to predict Chicago crash types</title>
  <link>https://www.jrwinget.com/blog/2021-12-18_predicting-pileups/</link>
  <description><![CDATA[ 




<p>Wouldn’t it be great to know the chances of being injured or having your car totaled in a car accident <em>before</em> ever being involved in one? If you live in a less densely populated area, this question might not be very interesting. But if you live in a big city, like Chicago, you might be a little more concerned about it. By identifying factors that predict these bad accidents, we might be able to develop low-cost interventions or redesign the environment to reduce how often they happen, which can translate to lives and money saved. So in this post, we’ll use machine learning to predict which traffic crashes in Chicago, IL, result in injuries and/or the vehicle being towed, based on situational features (e.g., posted speed limit, lighting conditions, road surface condition, etc.) that are likely known before the crash occurs.</p>
<p><a href="https://data.cityofchicago.org/Transportation/Traffic-Crashes-Crashes/85ca-t3if">Traffic crash</a> data were imported from the <a href="https://data.cityofchicago.org/">Chicago Data Portal</a> using the <a href="https://github.com/Chicago/RSocrata">RSocrata</a> package. The City of Chicago also has data sets related to the <a href="https://data.cityofchicago.org/Transportation/Traffic-Crashes-Vehicles/68nd-jvt3">vehicles</a> and <a href="https://data.cityofchicago.org/Transportation/Traffic-Crashes-People/u6pd-qa9d">people</a> involved in these crashes, but to keep things simple for now, let’s focus on a few situational factors related to the crash, all of which can be found in the main data set.</p>
<p>These data show information about each traffic crash on city streets within the City of Chicago and under the jurisdiction of the Chicago Police Department. Many of the variables (e.g., street conditions, weather conditions, etc.) are recorded by the reporting officer and are based on the best available information at the time, but according to the Chicago Data Portal, these data may disagree with other posted information. As such, data are subject to change based on new information.</p>
<section id="import-data" class="level2">
<h2 class="anchored" data-anchor-id="import-data">Import data</h2>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set.seed</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20211218</span>)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(RSocrata)</span>
<span id="cb1-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(tidyverse)</span>
<span id="cb1-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(lubridate)</span>
<span id="cb1-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(janitor)</span>
<span id="cb1-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(tidymodels)</span>
<span id="cb1-7"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(vip)</span>
<span id="cb1-8"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(skimr, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">include.only =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"skim"</span>)</span>
<span id="cb1-9"></span>
<span id="cb1-10"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># data pulled at time of post; new cases likely added to data portal since then</span></span>
<span id="cb1-11">crashes <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">read.socrata</span>(</span>
<span id="cb1-12">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"https://data.cityofchicago.org/resource/85ca-t3if.csv"</span>, <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># url of data set</span></span>
<span id="cb1-13">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">app_token =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Sys.getenv</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsocrata_token"</span>)                 <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># my personal creds</span></span>
<span id="cb1-14">) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb1-15">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">clean_names</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb1-16">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(</span>
<span id="cb1-17">    crash_type, crash_date, posted_speed_limit, traffic_control_device,</span>
<span id="cb1-18">    device_condition, weather_condition, lighting_condition, first_crash_type,</span>
<span id="cb1-19">    trafficway_type, alignment, roadway_surface_cond, road_defect, prim_contributory_cause</span>
<span id="cb1-20">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb1-21">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(</span>
<span id="cb1-22">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">crash_date =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ymd_hms</span>(crash_date),</span>
<span id="cb1-23">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">date =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as_date</span>(crash_date)</span>
<span id="cb1-24">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb1-25">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>crash_date)</span></code></pre></div></div>
</details>
</div>
<p>The goal here is to build a model that predicts whether a crash will result in someone being injured and/or the vehicle being towed. The outcome is the <code>crash_type</code> variable in the <code>crashes</code> data set.</p>
<p>From the variables listed in the <a href="https://data.cityofchicago.org/Transportation/Traffic-Crashes-Crashes/85ca-t3if">Traffic Crash</a> data set, let’s select some that are likely to have been known before the crash occurred. For example, <code>lighting_condition</code> records the light condition at the time of the crash. This situational factor is likely to be known (e.g., observed by the driver, reported by others prior to the crash, etc.) before an accident occurs. At the very least, it has a higher chance of being known prior to a crash than something like <code>injuries_total</code>, which records the total number of people who sustained an injury as a result of the accident. Since this is a consequence of a crash, it can only be known after the crash occurs.</p>
<p>With this logic, let’s focus on the following variables from the initial <a href="https://data.cityofchicago.org/Transportation/Traffic-Crashes-Crashes/85ca-t3if">Traffic Crash</a> data set:</p>
<table class="caption-top table">
<colgroup>
<col style="width: 33%">
<col style="width: 33%">
<col style="width: 33%">
</colgroup>
<thead>
<tr class="header">
<th style="text-align: center;">Column name</th>
<th style="text-align: center;">Description</th>
<th style="text-align: center;">Type</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="text-align: center;">crash_date</td>
<td style="text-align: center;">Date and time of crash as entered by the reporting officer</td>
<td style="text-align: center;">Date/Time</td>
</tr>
<tr class="even">
<td style="text-align: center;">posted_speed_limit</td>
<td style="text-align: center;">Posted speed limit, as determined by reporting officer</td>
<td style="text-align: center;">Numeric</td>
</tr>
<tr class="odd">
<td style="text-align: center;">traffic_control_device</td>
<td style="text-align: center;">Traffic control device present at crash location, as determined by reporting officer</td>
<td style="text-align: center;">Factor</td>
</tr>
<tr class="even">
<td style="text-align: center;">device_condition</td>
<td style="text-align: center;">Condition of traffic control device, as determined by reporting officer</td>
<td style="text-align: center;">Factor</td>
</tr>
<tr class="odd">
<td style="text-align: center;">weather_condition</td>
<td style="text-align: center;">Weather condition at time of crash, as determined by reporting officer</td>
<td style="text-align: center;">Factor</td>
</tr>
<tr class="even">
<td style="text-align: center;">lighting_condition</td>
<td style="text-align: center;">Light condition at time of crash, as determined by reporting officer</td>
<td style="text-align: center;">Factor</td>
</tr>
<tr class="odd">
<td style="text-align: center;">first_crash_type</td>
<td style="text-align: center;">Type of first collision in crash</td>
<td style="text-align: center;">Factor</td>
</tr>
<tr class="even">
<td style="text-align: center;">trafficway_type</td>
<td style="text-align: center;">Trafficway type, as determined by reporting officer</td>
<td style="text-align: center;">Factor</td>
</tr>
<tr class="odd">
<td style="text-align: center;">alignment</td>
<td style="text-align: center;">Street alignment at crash location, as determined by reporting officer</td>
<td style="text-align: center;">Factor</td>
</tr>
<tr class="even">
<td style="text-align: center;">roadway_surface_cond</td>
<td style="text-align: center;">Road surface condition, as determined by reporting officer</td>
<td style="text-align: center;">Factor</td>
</tr>
<tr class="odd">
<td style="text-align: center;">road_defect</td>
<td style="text-align: center;">Road defects, as determined by reporting officer</td>
<td style="text-align: center;">Factor</td>
</tr>
<tr class="even">
<td style="text-align: center;">prim_contributory_cause</td>
<td style="text-align: center;">The factor which was most significant in causing the crash, as determined by officer judgment</td>
<td style="text-align: center;">Factor</td>
</tr>
</tbody>
</table>
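<p>The selection logic above can be sketched in a few lines of base R. This is only an illustration on a made-up two-row data frame: the column names mirror the post, but the values, the <code>post_crash_cols</code> vector, and the <code>model_data</code> name are assumptions introduced here, not part of the original analysis.</p>

```r
# Toy data frame: column names mirror the post, values are made up.
toy <- data.frame(
  crash_type         = c("injury/tow", "no injury"),
  lighting_condition = c("daylight", "darkness"),
  posted_speed_limit = c(30, 35),
  injuries_total     = c(1, 0),
  stringsAsFactors   = FALSE
)

# Hypothetical list of columns that are consequences of the crash,
# so they cannot be known beforehand and should not be predictors.
post_crash_cols <- c("injuries_total")

outcome    <- "crash_type"
predictors <- setdiff(names(toy), c(outcome, post_crash_cols))

# Keep the outcome plus the predictors that could be known in advance.
model_data <- toy[, c(outcome, predictors)]
names(model_data)
```

<p>Listing the excluded columns explicitly keeps the reasoning visible: anything recorded as a result of the crash stays out of the predictor set.</p>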
</section>
<section id="data-exploration-cleaning" class="level2">
<h2 class="anchored" data-anchor-id="data-exploration-cleaning">Data exploration &amp; cleaning</h2>
<p>Now that we have some idea of what we’ll be looking at in our model, let’s get some impressions of the data and see if anything needs to be cleaned up before going further.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">glimpse</span>(crashes)</span></code></pre></div></div>
</details>
</div>
<p>The first thing to note is that there are <code>571426</code> observations with <code>13</code> columns in the <code>crashes</code> data set. And other than some potential capitalization issues with the strings, there don’t seem to be any obvious issues with the way the data are formatted. We’ll come back to this, but for now, let’s check for missing data.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">colSums</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">is.na</span>(crashes))</span></code></pre></div></div>
</details>
</div>
<p>Great, no missing data! This will simplify data preparation later on. Next, we should examine the frequency counts of the variables we’ll use to predict the outcome class.</p>
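<p>As a small aside on what this check measures: <code>colSums(is.na(...))</code> counts true <code>NA</code> values only. The sketch below uses a made-up data frame with a hypothetical <code>"unknown"</code> placeholder to show the difference; whether the crash data contain such placeholders is something the frequency counts would have to reveal, and is not assumed here.</p>

```r
# Toy data: values and the "unknown" placeholder are made up for illustration.
toy <- data.frame(
  weather_condition = c("clear", "unknown", NA, "rain"),
  road_defect       = c("no defects", "unknown", "unknown", "no defects"),
  stringsAsFactors  = FALSE
)

# Same check as in the post: counts real NA values per column.
na_counts <- colSums(is.na(toy))

# Separate count of placeholder strings, which is.na() does not flag.
placeholder_counts <- colSums(toy == "unknown", na.rm = TRUE)

na_counts
placeholder_counts
```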
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1">var_freq <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(df) {</span>
<span id="cb4-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">names</span>(df), <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">count</span>(df, .data[[.x]]))[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>]</span>
<span id="cb4-3">}</span>
<span id="cb4-4"></span>
<span id="cb4-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">var_freq</span>(crashes)</span></code></pre></div></div>
</details>
</div>
<p>Looks like there are some issues that need to be addressed here! First, the responses that are coded for some variables don’t make sense. For example, <code>posted_speed_limit</code> has <code>6967</code> recorded observations for a posted speed limit of 0 mph. Clearly, this is not a legitimate posted speed limit within the City of Chicago (as much as it may feel like it on the Dan Ryan at 5pm). There are some other odd speed limits recorded for this variable as well.</p>
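<p>One way to surface odd speed limits like these is a simple plausibility flag. The rule below (multiples of 5 between 5 and 70 mph) is an assumption chosen for illustration, not the rule used later in this post, and the data are made up.</p>

```r
# Toy speed limits, made up for illustration.
toy <- data.frame(posted_speed_limit = c(0, 3, 25, 30, 33, 45, 99))

# Assumed plausibility rule: a multiple of 5 between 5 and 70 mph.
is_plausible <- with(
  toy,
  posted_speed_limit >= 5 &
    posted_speed_limit <= 70 &
    posted_speed_limit %% 5 == 0
)

# Values that would need a closer look before modeling.
suspect <- toy$posted_speed_limit[!is_plausible]
suspect
```

<p>Flagging before recoding or dropping makes it easy to see how many observations a given rule would affect.</p>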
<p>Another issue these frequency counts reveal is that many of the levels within a variable could be grouped together. For example, <code>prim_contributory_cause</code> makes a distinction between disregarding road markings, stop signs, traffic signals, yield signs, and other traffic signs. Instead, these levels could be grouped into a single level called “disregarding signs/markings”.</p>
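<p>The grouping idea can be sketched with a membership test. This is a base R illustration on a made-up vector; the level spellings are assumed to match the lower-cased data, and <code>sign_levels</code> and <code>grouped</code> are names introduced here.</p>

```r
# Toy vector of causes, made up for illustration.
causes <- c(
  "disregarding stop sign",
  "disregarding traffic signals",
  "failing to yield right-of-way",
  "disregarding road markings"
)

# Levels to collapse into one group (spellings are assumptions).
sign_levels <- c(
  "disregarding other traffic signs",
  "disregarding road markings",
  "disregarding stop sign",
  "disregarding traffic signals",
  "disregarding yield sign"
)

# Recode matching levels; leave everything else unchanged.
grouped <- ifelse(
  causes %in% sign_levels,
  "disregarding signs/markings",
  causes
)
table(grouped)
```

<p>Using <code>%in%</code> against a vector of levels is equivalent to chaining <code>==</code> comparisons with <code>|</code>, and is easier to extend when more levels join a group.</p>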
<p>So, let’s address each of these problems by cleaning up the levels for each variable. And while we’re at it, let’s clean up the format of all of the strings so there is a consistent style (i.e., lower case).</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb5-1">crashes <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> crashes <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(</span>
<span id="cb5-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">across</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">where</span>(is.character), <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str_to_lower</span>(.)),</span>
<span id="cb5-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">traffic_control_device =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">case_when</span>(</span>
<span id="cb5-5">      traffic_control_device <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"railroad crossing gate"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-6">       traffic_control_device <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"other railroad crossing"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-7">        traffic_control_device <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rr crossing sign"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rr crossing"</span>,</span>
<span id="cb5-8">      traffic_control_device <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bicycle crossing sign"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-9">        traffic_control_device <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"pedestrian crossing sign"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-10">        traffic_control_device <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"school zone"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"pedestrian signs"</span>,</span>
<span id="cb5-11">      traffic_control_device <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"flashing control signal"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"stop sign/flasher"</span>,</span>
<span id="cb5-12">      traffic_control_device <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"no passing"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"other warning sign"</span>,</span>
<span id="cb5-13">      <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> traffic_control_device</span>
<span id="cb5-14">    ),</span>
<span id="cb5-15">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">device_condition =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">case_when</span>(</span>
<span id="cb5-16">      device_condition <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"missing"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"no controls"</span>,</span>
<span id="cb5-17">      device_condition <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"worn reflective material"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-18">        device_condition <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"not functioning"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"functioning improperly"</span>,</span>
<span id="cb5-19">      <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> device_condition</span>
<span id="cb5-20">    ),</span>
<span id="cb5-21">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">weather_condition =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">case_when</span>(</span>
<span id="cb5-22">      weather_condition <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blowing sand, soil, dirt"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-23">        weather_condition <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"severe cross wind gate"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-24">        weather_condition <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blowing snow"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blowing debris"</span>,</span>
<span id="cb5-25">      weather_condition <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"freezing rain/drizzle"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-26">        weather_condition <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sleet/hail"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sleet/hail/freezing rain"</span>,</span>
<span id="cb5-27">      <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> weather_condition</span>
<span id="cb5-28">    ),</span>
<span id="cb5-29">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">prim_contributory_cause =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">case_when</span>(</span>
<span id="cb5-30">      prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"disregarding other traffic signs"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-31">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"disregarding road markings"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-32">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"disregarding stop sign"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-33">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"disregarding traffic signals"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-34">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"disregarding yield sign"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-35">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"passing stopped school bus"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"disregarding signs/markings"</span>,</span>
<span id="cb5-36">      prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"distraction - from inside vehicle"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-37">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"distraction - from outside vehicle"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-38">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"distraction - other electronic device (navigation device, dvd player, etc.)"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-39">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"cell phone use other than texting"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-40">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"texting"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"distraction"</span>,</span>
<span id="cb5-41">      prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"had been drinking (use when arrest is not made)"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-42">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"under the influence of alcohol/drugs (use when arrest is effected)"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"under the influence"</span>,</span>
<span id="cb5-43">      prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"obstructed crosswalks"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-44">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"vision obscured (signs, tree limbs, buildings, etc.)"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"obstructions"</span>,</span>
<span id="cb5-45">      prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bicycle advancing legally on red light"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-46">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"motorcycle advancing legally on red light"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bike/motorcycle advancing legally on red light"</span>,</span>
<span id="cb5-47">      prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"animal"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-48">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"evasive action due to animal, object, nonmotorist"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"evasive action"</span>,</span>
<span id="cb5-49">      prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"exceeding authorized speed limit"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb5-50">        prim_contributory_cause <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"exceeding safe speed for conditions"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"speeding"</span>,</span>
<span id="cb5-51">      <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> prim_contributory_cause</span>
<span id="cb5-52">    ),</span>
<span id="cb5-53">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">across</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">where</span>(is_character), <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as_factor</span>(.))</span>
<span id="cb5-54">  )</span></code></pre></div></div>
</details>
</div>
<p>Great! Now, the data are clean, and we can start thinking about how to set up the model. Let’s take a look at the current data summary.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">skim</span>(crashes)</span></code></pre></div></div>
</details>
</div>
<p>And, let’s take a closer look at the dependent variable, <code>crash_type</code>.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb7-1">crashes <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">count</span>(crash_type) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">prop =</span> n <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>(n))</span></code></pre></div></div>
</details>
</div>
<p>It looks like there might be a bit of imbalance in the data since the class proportions are skewed towards non-injury/drive-away (74.3%) crash types. Imbalance can sometimes lead to problems in an analysis, especially in severe cases of imbalance. Fortunately, there are a few approaches that try to mitigate this issue (e.g., <a href="https://tidymodels.github.io/themis/">themis</a>). For now, let’s just analyze the data as is.</p>
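<p>As a rough illustration of what such a mitigation could look like, here is a minimal sketch that down-samples the majority class with <code>step_downsample()</code> from themis. The <code>toy</code> data frame and its columns are made up for this example, and this step is not part of the analysis in this post.</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r">library(recipes)
library(themis)

set.seed(123)

# Hypothetical toy data: 80 "minor" and 20 "major" outcomes
toy &lt;- data.frame(
  outcome = factor(rep(c("minor", "major"), times = c(80, 20))),
  x = rnorm(100)
)

# Down-sample the majority class until both classes are the same size
balanced &lt;- recipe(outcome ~ x, data = toy) %&gt;%
  step_downsample(outcome) %&gt;%
  prep() %&gt;%
  bake(new_data = NULL)

# Class counts after down-sampling
table(balanced$outcome)</code></pre></div>
</div>
<p>Down-sampling discards observations from the larger class, so it trades some information for balance. Whether that trade is worthwhile depends on the data and the model.</p>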
<p>At this point, it would be helpful to know which variables in the <code>crashes</code> data set are associated with the different levels of the dependent variable, <code>crash_type</code>. One quick way of doing this for the numeric predictors is by using box plots.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb8-1">crashes <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> crash_type, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> posted_speed_limit, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> crash_type)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_boxplot</span>()</span></code></pre></div></div>
</details>
</div>
<p>It looks like any differences between the <code>crash_type</code> levels are quite small for <code>posted_speed_limit</code>. So, maybe this variable won’t be so helpful in predicting injuries/towed crash types after all.</p>
<p>Next, let’s check the relationship between the categorical variables and <code>crash_type</code> using simple counts. We’ll also keep only the levels that account for more than 1% of the observations within each crash type to get a better idea of the larger data patterns.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb9-1">print_counts <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(.y_var) {</span>
<span id="cb9-2">  y_var <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sym</span>(.y_var)</span>
<span id="cb9-3"></span>
<span id="cb9-4">  crashes <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">count</span>(crash_type, {{y_var}}) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(crash_type) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-7">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">percent =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">round_half_up</span>(n <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>(n) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb9-8">}</span>
<span id="cb9-9"></span>
<span id="cb9-10">y_var <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> crashes <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-11">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">where</span>(is.factor), <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>crash_type) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-12">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">variable.names</span>()</span>
<span id="cb9-13"></span>
<span id="cb9-14"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map</span>(y_var, print_counts) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-15">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map</span>(., <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(., percent <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span></code></pre></div></div>
</details>
</div>
<p>It looks like any differences between the two <code>crash_type</code> classes are small for <code>alignment</code> and <code>weather_condition</code> as well. However, because of the small differences across multiple levels of <code>weather_condition</code>, it’s tough to see whether there really is a relationship there. Another way we can look for differences between two categorical variables is by plotting a heatmap of the frequency counts.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb10-1">crashes <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb10-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(crash_type, weather_condition)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_bin2d</span>()</span></code></pre></div></div>
</details>
</div>
<p>Although there are some differences, these appear to be pretty small. So, perhaps weather isn’t important for this model either.</p>
</section>
<section id="data-preparation" class="level2">
<h2 class="anchored" data-anchor-id="data-preparation">Data preparation</h2>
<p>Next, we’ll do a bit of preprocessing before training the models. This is where we’ll handle feature selection, data splitting, feature engineering, feature scaling, and creating the validation set (i.e., resampling).</p>
<p>The first thing we’ll do here is drop the variables that did not seem to have much of a relationship with <code>crash_type</code> during data exploration.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb11-1">crashes <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(crashes, <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(posted_speed_limit, weather_condition, alignment))</span></code></pre></div></div>
</details>
</div>
<p>Next, let’s split the single data set into two: a <em>training set</em> and a <em>testing set</em>. A training data set is a data set of examples used during the learning process and is used to fit the models. A test data set is a data set that is independent of the training data set and is used to evaluate the performance of the final model. If a model fit to the training data set also fits the test data set well, we can be confident minimal overfitting has taken place. On the other hand, if the model seems to fit the training set better than the test set, we might have a case of overfitting.</p>
<p>For a data splitting strategy, let’s set aside 25% of the data for the test set. Since the outcome variable (<code>crash_type</code>) is somewhat imbalanced, we’ll also use a stratified random sample.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb12-1">crash_split <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">initial_split</span>(crashes, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">strata =</span> crash_type)</span>
<span id="cb12-2">crash_train <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">training</span>(crash_split)</span>
<span id="cb12-3">crash_test <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">testing</span>(crash_split)</span></code></pre></div></div>
</details>
</div>
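<p>To see what stratification buys us, here is a minimal sketch on made-up data. The <code>toy</code> data frame is hypothetical, and the <code>set.seed()</code> call is an addition that simply makes the random split reproducible. The <code>prop = 3/4</code> argument is the <code>initial_split()</code> default, written out to make the 75/25 split explicit.</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r">library(rsample)

set.seed(123)

# Hypothetical toy data: 800 "minor" and 200 "major" outcomes
toy &lt;- data.frame(
  outcome = factor(rep(c("minor", "major"), times = c(800, 200))),
  x = rnorm(1000)
)

# Stratified 75/25 split on the outcome
toy_split &lt;- initial_split(toy, prop = 3/4, strata = outcome)
toy_train &lt;- training(toy_split)
toy_test &lt;- testing(toy_split)

# Share of "major" outcomes in each partition
mean(toy_train$outcome == "major")
mean(toy_test$outcome == "major")</code></pre></div>
</div>
<p>Because sampling happens within each outcome class, both partitions should keep roughly the same class proportions as the full data set.</p>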
<p>Next, let’s create a base recipe for all models. Note that the sequence of steps matters here:</p>
<ul>
<li><code>recipe()</code>:
<ul>
<li>Any variable on the left-hand side of the tilde (<code>~</code>) is considered the model outcome (here, <code>crash_type</code>). The predictors of the model outcome appear on the right-hand side of the tilde. Here, we use the dot (<code>.</code>) to indicate that all the other variables will be used as predictors.</li>
<li>A recipe is also associated with the data set used to create the model. This will usually be the training set, so <code>crash_train</code> here.</li>
</ul>
</li>
<li><code>step_date()</code>: Creates predictors for the year, month, and day of the week. Here, we’re selecting only the day of the week and month since there are limited observations for earlier years (e.g., 2013, 2014) in the data.</li>
<li><code>step_rm()</code>: Removes variables; here we use it to remove the original date variable since we no longer want it in the model.</li>
<li><code>step_normalize()</code>: Centers and scales numeric variables.</li>
<li><code>step_dummy()</code>: Converts characters or factors (i.e., nominal variables) into one or more numeric binary model terms for the levels of the original data.</li>
<li><code>step_zv()</code>: Removes indicator variables that only contain a single unique value (e.g.&nbsp;all zeros).</li>
</ul>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb13-1">crash_recipe <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">recipe</span>(crash_type <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> ., <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> crash_train) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb13-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">step_date</span>(date, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">features =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dow"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"month"</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb13-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">step_rm</span>(date) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb13-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">step_normalize</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">all_numeric_predictors</span>(), <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">all_outcomes</span>()) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb13-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">step_dummy</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">all_nominal_predictors</span>(), <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">all_outcomes</span>()) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb13-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">step_zv</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">all_predictors</span>(), <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">all_outcomes</span>())</span></code></pre></div></div>
</details>
</div>
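<p>A recipe is only a specification until it is estimated. One way to check what the steps produce is to run the same sequence on a small data set with <code>prep()</code> and <code>bake()</code>. The sketch below does this on a made-up <code>toy</code> data frame (a hypothetical stand-in for <code>crash_train</code>), so the column names it prints are illustrative only.</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r">library(recipes)

# Hypothetical toy data standing in for crash_train
toy &lt;- data.frame(
  outcome = factor(c("minor", "major", "minor", "major", "minor", "major")),
  date = as.Date("2020-01-01") + c(0, 1, 2, 31, 32, 60),
  speed = c(25, 30, 35, 30, 45, 20),
  surface = factor(c("dry", "wet", "dry", "dry", "wet", "dry"))
)

# Same sequence of steps as the base recipe
toy_recipe &lt;- recipe(outcome ~ ., data = toy) %&gt;%
  step_date(date, features = c("dow", "month")) %&gt;%
  step_rm(date) %&gt;%
  step_normalize(all_numeric_predictors()) %&gt;%
  step_dummy(all_nominal_predictors()) %&gt;%
  step_zv(all_predictors())

# prep() estimates each step on the data;
# bake(new_data = NULL) returns the processed training data
baked &lt;- toy_recipe %&gt;%
  prep() %&gt;%
  bake(new_data = NULL)

names(baked)</code></pre></div>
</div>
<p>In the result, the original <code>date</code> column is gone, the numeric predictor is centered and scaled, and the factors have been replaced by indicator columns. Indicator columns for days or months that never occur in the toy data are dropped by <code>step_zv()</code>, which is why that step comes after <code>step_dummy()</code>.</p>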
<p>Recall that we already partitioned our data set into a <em>training set</em> and <em>test set</em>. This lets us judge whether a given model will generalize well to new data. However, using only two partitions may be insufficient when doing many rounds of hyperparameter tuning. So, it’s usually a good idea to create a <em>validation set</em> as well. We’ll use k-fold cross validation to build a set of 5 validation folds with the function <code>vfold_cv</code>, and we’ll also use stratified sampling to maintain the outcome class proportions.</p>
<p>k-fold cross validation randomly allocates the <code>571184</code> observations in the training set to 5 groups of roughly equal size, called “folds”. For the first iteration of resampling, the first fold (the <em>assessment set</em>) is held out for the purpose of measuring performance. The other 80% of the data (the <em>analysis set</em>) are used to fit the model. This model, trained on the analysis set, is applied to the assessment set to generate predictions. Then, performance statistics are computed based on those predictions.</p>
<p>In this case, 5-fold cross validation iteratively moves through the folds and leaves a different 20% out each time for model assessment. At the end of this process, there are 5 sets of performance statistics, each computed on data that were not used to fit the corresponding model. The 5 fitted models themselves are not kept, because their only purpose is to produce these performance statistics. The final resampling estimates for the model are the averages of the 5 replicates of each statistic.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb14-1">crashes_vfold <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vfold_cv</span>(crash_train, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">v =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">strata =</span> crash_type)</span></code></pre></div></div>
</details>
</div>
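<p>The analysis/assessment bookkeeping can be inspected directly. Here is a minimal sketch on made-up data (the <code>toy</code> data frame and the seed are assumptions for illustration) that counts how many rows each fold uses for fitting and how many it holds out.</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r">library(rsample)

set.seed(123)

# Hypothetical toy data: 800 "minor" and 200 "major" outcomes
toy &lt;- data.frame(
  outcome = factor(rep(c("minor", "major"), times = c(800, 200))),
  x = rnorm(1000)
)

toy_folds &lt;- vfold_cv(toy, v = 5, strata = outcome)

# Rows used to fit the model and rows held out to measure performance, per fold
fold_sizes &lt;- data.frame(
  analysis = vapply(toy_folds$splits, function(s) nrow(analysis(s)), integer(1)),
  assessment = vapply(toy_folds$splits, function(s) nrow(assessment(s)), integer(1))
)

fold_sizes</code></pre></div>
</div>
<p>Each row of <code>fold_sizes</code> describes one resample. The assessment sizes add up to the full data set, since every observation is held out exactly once.</p>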
<p>We will come back to the validation set after we have specified the models.</p>
</section>
<section id="model-1-logistic-regression" class="level2">
<h2 class="anchored" data-anchor-id="model-1-logistic-regression">Model 1: Logistic regression</h2>
<p>All available models are listed at <a href="https://www.tidymodels.org/find/parsnip/">https://www.tidymodels.org/find/parsnip/</a>. Since the outcome variable (<code>crash_type</code>) is categorical, a logistic regression model is a good place to start. Let’s use a model that can perform feature selection during training. The <a href="https://cran.r-project.org/web/packages/glmnet/index.html">glmnet</a> R package fits a generalized linear model via penalized maximum likelihood. This method of estimating the logistic regression slope parameters uses a penalty on the process so that less relevant predictors are driven towards a value of zero. One of the glmnet penalization methods, called the <a href="https://en.wikipedia.org/wiki/Lasso_(statistics)">lasso method</a>, can set the predictor slopes to zero if a large enough penalty is used.</p>
<p>To specify a penalized logistic regression model that uses a feature selection penalty, let’s use the parsnip package with the <a href="https://www.tidymodels.org/find/parsnip/">glmnet engine</a>.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb15" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb15-1">lr_mod <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">logistic_reg</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">penalty =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tune</span>(), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mixture =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb15-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set_engine</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"glmnet"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb15-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set_mode</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"classification"</span>)</span></code></pre></div></div>
</details>
</div>
<p>We’ll set the <code>penalty</code> argument to <code>tune()</code> as a placeholder for now. This is a model hyperparameter that we will <a href="https://www.tidymodels.org/start/tuning/">tune</a> to find the best value for making predictions with our data. Setting <code>mixture</code> to a value of 1 means the glmnet model will potentially remove irrelevant predictors and choose a simpler model (i.e., via least absolute shrinkage and selection operator).</p>
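<p>For comparison, here is a minimal sketch of how the same specification would look with other values of <code>mixture</code>. In parsnip, <code>mixture = 0</code> gives a pure ridge penalty and values between 0 and 1 give an elastic net. The object names <code>ridge_mod</code> and <code>enet_mod</code> and the value 0.5 are illustrative assumptions and are not used elsewhere in this post.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(tidymodels)

# Hypothetical alternative: ridge penalty (mixture = 0).
# Slopes are shrunk towards zero but are not set exactly to zero,
# so no predictors are removed.
ridge_mod &lt;- logistic_reg(penalty = tune(), mixture = 0) %&gt;%
  set_engine("glmnet") %&gt;%
  set_mode("classification")

# Hypothetical alternative: elastic net (0 &lt; mixture &lt; 1).
# The value 0.5 is arbitrary and only for illustration.
enet_mod &lt;- logistic_reg(penalty = tune(), mixture = 0.5) %&gt;%
  set_engine("glmnet") %&gt;%
  set_mode("classification")</code></pre></div></div>
</details>
</div>
<p>Because we want feature selection here, the lasso specification (<code>mixture = 1</code>) is the one used in the rest of this section.</p>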
<section id="create-the-workflow" class="level3">
<h3 class="anchored" data-anchor-id="create-the-workflow">Create the workflow</h3>
<p>Now, let’s bundle the model and recipe into a single <code>workflow()</code> object to make management of the R objects easier.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb16" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb16-1">lr_workflow <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">workflow</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb16-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">add_model</span>(lr_mod) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb16-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">add_recipe</span>(crash_recipe)</span></code></pre></div></div>
</details>
</div>
</section>
<section id="train-and-tune-the-model" class="level3">
<h3 class="anchored" data-anchor-id="train-and-tune-the-model">Train and tune the model</h3>
<p>Before we fit this model, we need to set up a grid of <code>penalty</code> values to tune. Since there is only one hyperparameter to tune here, we can set the grid up manually using a one-column tibble with 30 candidate values.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb17" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb17-1">lr_reg_grid <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tibble</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">penalty =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seq</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">length.out =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>))</span></code></pre></div></div>
</details>
</div>
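<p>To see what this grid contains, the same values can be inspected with base R alone. This is just a quick check; the name <code>penalty_values</code> is an assumption introduced for illustration.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Same values as the penalty column of lr_reg_grid
penalty_values &lt;- 10^seq(-4, -1, length.out = 30)

length(penalty_values)         # 30 candidates
range(penalty_values)          # from 1e-04 up to 1e-01
signif(penalty_values[12], 3)  # 0.00137

# The candidates are evenly spaced on the log10 scale (step of 3/29)
diff(log10(penalty_values))</code></pre></div></div>
</details>
</div>
<p>Spacing the candidates evenly on the log scale puts more of them at small penalty values, where the model’s behavior tends to change most quickly.</p>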
<p>Now we can use the validation set (<code>crashes_vfold</code>) to estimate the performance of our models by fitting the models on each of the folds and storing the results.</p>
<p>Let’s use <code>tune_grid()</code> to train these penalized logistic regression models. This will fit our model to each resample and evaluate it on the held-out set from each resample. We’ll also save the validation set predictions (using <code>control_grid()</code>) so that diagnostic information is available after the model fit. We’ll use four metrics to quantify how well the model performs: the area under the ROC curve, which summarizes performance across a continuum of event thresholds, and precision, recall, and F1-Score, which are computed from the predicted classes.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb18" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb18-1">lr_res <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> lr_workflow <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb18-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tune_grid</span>(</span>
<span id="cb18-3">    crashes_vfold,</span>
<span id="cb18-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">grid =</span> lr_reg_grid,</span>
<span id="cb18-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">control =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">control_grid</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">save_pred =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>),</span>
<span id="cb18-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">metrics =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">metric_set</span>(roc_auc, precision, recall, f_meas)</span>
<span id="cb18-7">  )</span></code></pre></div></div>
</details>
</div>
</section>
<section id="evaluate-the-model" class="level3">
<h3 class="anchored" data-anchor-id="evaluate-the-model">Evaluate the model</h3>
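<p>Let’s take a look at the performance metrics for each candidate penalty value. By default, <code>collect_metrics()</code> averages each metric across the 5 folds.</p>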
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb19" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb19-1">lr_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb19-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_metrics</span>()</span></code></pre></div></div>
</details>
</div>
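<p>If the fold-level values are of interest, <code>collect_metrics()</code> has a <code>summarize</code> argument that can be set to <code>FALSE</code>. The sketch below assumes <code>lr_res</code> is the tuning result created above and simply restricts the output to the ROC AUC rows.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(tidymodels)

# One row per fold, penalty value, and metric, instead of the
# across-fold average returned by default
lr_res %&gt;%
  collect_metrics(summarize = FALSE) %&gt;%
  filter(.metric == "roc_auc") %&gt;%
  arrange(penalty, id)</code></pre></div></div>
</details>
</div>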
<p>This isn’t very helpful on its own. Let’s visualize the validation set metrics by plotting the area under the ROC curve against the range of penalty values.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb20" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb20-1">lr_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb20-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_metrics</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb20-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.metric</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"roc_auc"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb20-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> penalty, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> mean)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb20-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb20-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb20-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ylab</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Area under the ROC Curve"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb20-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_log10</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">labels =</span> scales<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">label_number</span>())</span></code></pre></div></div>
</details>
</div>
<p>This plot suggests model performance is generally better at the smaller penalty values, meaning the majority of the predictors are important to the model. There’s also a steep drop in the area under the ROC curve towards the highest penalty values. This happens because a large enough penalty will remove all predictors from the model. And when there are no predictors in the model, predictive accuracy takes a nose dive.</p>
<p>Our model performance seems to plateau at the smaller penalty values, so judging performance by the <code>roc_auc</code> metric alone could lead to multiple options for the “best” value for this hyperparameter.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb21" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb21-1">lr_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb21-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">show_best</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"roc_auc"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb21-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrange</span>(penalty)</span></code></pre></div></div>
</details>
</div>
<p>However, we may want to choose a penalty value further along the x-axis, closer to where we start to see the decline in model performance. For example, candidate model 12 with a penalty value of 0.00137 has basically the same performance as the numerically best model (model 1). However, model 12 might eliminate more predictors than model 1, and generally speaking, fewer irrelevant predictors is better. So if model performance is about the same, we should choose a model with a higher penalty value.</p>
<p>But keep in mind, we also collected other performance metrics. So, let’s take a look at those:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb22" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb22-1">perf_metrics <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"roc_auc"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"precision"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"recall"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"f_meas"</span>)</span>
<span id="cb22-2"></span>
<span id="cb22-3">get_metrics <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(x) {</span>
<span id="cb22-4">  lr_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb22-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">show_best</span>(x, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb22-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrange</span>(penalty)</span>
<span id="cb22-7">}</span>
<span id="cb22-8"></span>
<span id="cb22-9"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map</span>(perf_metrics, get_metrics)</span></code></pre></div></div>
</details>
</div>
<p>Let’s select the candidate with the best F1-Score, which is model 15 in this case:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb23" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb23-1">lr_best <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> lr_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb23-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select_best</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">metric =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"f_meas"</span>)</span></code></pre></div></div>
</details>
</div>
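<p>The earlier idea of preferring a higher penalty when performance is about the same can also be automated. As a hedged sketch, tune’s <code>select_by_one_std_err()</code> picks the first candidate, in the sort order we give it, whose metric is within one standard error of the numerically best one. The name <code>lr_simple</code> is an assumption introduced for illustration; the rest of this post continues with <code>lr_best</code>.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(tidymodels)

# Assumes lr_res is the tuning result created above.
# Sorting by desc(penalty) treats the largest penalty as the
# "simplest" model, so the most heavily penalized candidate that is
# within one standard error of the best mean ROC AUC is returned.
lr_simple &lt;- lr_res %&gt;%
  select_by_one_std_err(desc(penalty), metric = "roc_auc")

lr_simple</code></pre></div></div>
</details>
</div>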
<p>Now we can use the predictions to create a confusion matrix with <code>conf_mat()</code>.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb24" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb24-1">lr_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb24-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_predictions</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">parameters =</span> lr_best) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb24-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">conf_mat</span>(crash_type, .pred_class)</span></code></pre></div></div>
</details>
</div>
<p>The confusion matrix can also be visualized in different formats using <code>autoplot()</code>. I personally like the <code>heatmap</code> type, but there are others that can be used as well.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb25" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb25-1">lr_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb25-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_predictions</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">parameters =</span> lr_best)  <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb25-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">conf_mat</span>(crash_type, .pred_class) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb25-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autoplot</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"heatmap"</span>)</span></code></pre></div></div>
</details>
</div>
<p>Let’s visualize the validation set ROC curve:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb26" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb26-1">lr_auc <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> lr_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb26-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_predictions</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">parameters =</span> lr_best) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb26-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">roc_curve</span>(crash_type, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.pred_injury and / or tow due to crash</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb26-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">model =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Logistic Regression"</span>)</span>
<span id="cb26-5"></span>
<span id="cb26-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autoplot</span>(lr_auc)</span></code></pre></div></div>
</details>
</div>
<p>We can also make a ROC curve for each of the 5 folds. Since the category we are predicting is the injury/tow level in the <code>crash_type</code> factor, we provide <code>roc_curve()</code> with the relevant class probability <code>.pred_injury and / or tow due to crash</code>:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb27" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb27-1">lr_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb27-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_predictions</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">parameters =</span> lr_best) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb27-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(id) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb27-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">roc_curve</span>(crash_type, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.pred_injury and / or tow due to crash</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb27-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autoplot</span>()</span></code></pre></div></div>
</details>
</div>
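<p>To put a number on each of those curves, the same grouped predictions can be passed to yardstick’s <code>roc_auc()</code>. This is a small sketch that assumes <code>lr_res</code> and <code>lr_best</code> are the objects created above.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(tidymodels)

# One ROC AUC estimate per fold for the selected penalty value
lr_res %&gt;%
  collect_predictions(parameters = lr_best) %&gt;%
  group_by(id) %&gt;%
  roc_auc(crash_type, `.pred_injury and / or tow due to crash`)</code></pre></div></div>
</details>
</div>
<p>A large spread between folds would suggest that the performance estimate is sensitive to how the data were split.</p>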
<p>Finally, we can also look at the predicted probability distributions for our two classes:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb28" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb28-1">lr_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb28-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_predictions</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">parameters =</span> lr_best) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb28-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb28-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_density</span>(</span>
<span id="cb28-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.pred_injury and / or tow due to crash</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb28-6">        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> crash_type),</span>
<span id="cb28-7">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">alpha =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span></span>
<span id="cb28-8">  )</span></code></pre></div></div>
</details>
</div>
<p>The level of performance generated by this logistic regression model isn’t great, but it’s better than an educated guess. Based on the frequency of crashes that result in injuries or vehicles being towed in the entire data set, we would expect about 24.6% of crashes to have these outcomes. However, based on the features we’ve selected here, our model correctly predicted these crash types about 33% of the time. So, we’ve improved our predictions, but only by about 8 percentage points. Perhaps the linear nature of the prediction equation is limiting our model’s performance. As a next step, we might consider using a non-linear model, like a tree-based ensemble method.</p>
</section>
</section>
<section id="model-2-random-forest" class="level2">
<h2 class="anchored" data-anchor-id="model-2-random-forest">Model 2: Random forest</h2>
<p>An effective, low-maintenance, non-linear modeling approach is a random forest, which tends to be more flexible than logistic regression. A random forest is an ensemble model that often consists of thousands of decision trees. Each individual tree sees a slightly different version of the training set and learns a sequence of splitting rules to predict new data. Random forests require very little preprocessing and can handle many types of predictors (e.g., skewed, continuous, categorical, etc.). Although the default hyperparameters for random forests tend to give reasonable results, we’ll tune two hyperparameters that could improve performance. Tuning should also help because we’ll limit the number of trees to 20 to reduce the time it takes to fit the model.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb29" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb29-1">rf_mod <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand_forest</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mtry =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tune</span>(), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">min_n =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tune</span>(), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">trees =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb29-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set_engine</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ranger"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">importance =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"impurity"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb29-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set_mode</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"classification"</span>)</span></code></pre></div></div>
</details>
</div>
<p>For the hyperparameters in this model, we use <code>tune()</code> as a placeholder for the <code>mtry</code> and <code>min_n</code> argument values. The <code>mtry</code> hyperparameter sets the number of predictor variables that each node in the decision tree sees and learns about. The <code>min_n</code> hyperparameter sets the minimum number of data points a node must contain before it can be split further. We also added <code>importance = "impurity"</code> when setting the engine. This will provide variable importance scores for this model, which give some insight into which predictors drive model performance.</p>
<section id="create-the-workflow-1" class="level3">
<h3 class="anchored" data-anchor-id="create-the-workflow-1">Create the workflow</h3>
<p>Next, let’s bundle the recipe and model.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb30" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb30-1">rf_workflow <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">workflow</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb30-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">add_model</span>(rf_mod) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb30-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">add_recipe</span>(crash_recipe)</span></code></pre></div></div>
</details>
</div>
</section>
<section id="train-and-tune-the-model-1" class="level3">
<h3 class="anchored" data-anchor-id="train-and-tune-the-model-1">Train and tune the model</h3>
<p>Since we have more than one hyperparameter to tune in this model, let’s use a space-filling design with 25 candidate models.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb31" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb31-1">rf_res <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> rf_workflow <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb31-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tune_grid</span>(</span>
<span id="cb31-3">    crashes_vfold,</span>
<span id="cb31-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">grid =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">25</span>,</span>
<span id="cb31-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">control =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">control_grid</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">save_pred =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>),</span>
<span id="cb31-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">metrics =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">metric_set</span>(roc_auc, precision, recall, f_meas)</span>
<span id="cb31-7">  )</span></code></pre></div></div>
</details>
</div>
</section>
<section id="evaluate-the-model-1" class="level3">
<h3 class="anchored" data-anchor-id="evaluate-the-model-1">Evaluate the model</h3>
<p>Out of the 25 candidates, here are the top 5 random forest models based on their F1-Scores:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb32" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb32-1">rf_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb32-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">show_best</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">metric =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"f_meas"</span>)</span></code></pre></div></div>
</details>
</div>
<p>Let’s select the best model according to the F1-Score. Our final tuning parameter values are:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb33" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb33-1">rf_best <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> rf_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb33-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select_best</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">metric =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"f_meas"</span>)</span>
<span id="cb33-3"></span>
<span id="cb33-4">rf_best</span></code></pre></div></div>
</details>
</div>
<p>To calculate the data needed to plot the ROC curve, we use <code>collect_predictions()</code>. This is only possible after tuning with <code>control_grid(save_pred = TRUE)</code>. Now, we can use the predictions to create a confusion matrix with <code>conf_mat()</code>.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb34" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb34-1">rf_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb34-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_predictions</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb34-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">conf_mat</span>(crash_type, .pred_class)</span></code></pre></div></div>
</details>
</div>
<p>To filter the predictions for only our best random forest model, we can use the <code>parameters</code> argument and pass it the tibble with the best hyperparameter values from tuning, which we called <code>rf_best</code>.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb35" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb35-1">rf_auc <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> rf_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb35-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_predictions</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">parameters =</span> rf_best) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb35-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">roc_curve</span>(crash_type, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.pred_injury and / or tow due to crash</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb35-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">model =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Random Forest"</span>)</span>
<span id="cb35-5"></span>
<span id="cb35-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autoplot</span>(rf_auc)</span></code></pre></div></div>
</details>
</div>
</section>
</section>
<section id="compare-models" class="level2">
<h2 class="anchored" data-anchor-id="compare-models">Compare models</h2>
<p>Now, it’s time to compare the models. The first thing we’ll do is extract the performance metrics from each of the models and combine them into a single data frame.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb36" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb36-1">lr_metrics <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> lr_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb36-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_metrics</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb36-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">model =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Logistic Regression"</span>)</span>
<span id="cb36-4"></span>
<span id="cb36-5">rf_metrics <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> rf_res <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb36-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_metrics</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb36-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">model =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Random Forest"</span>)</span>
<span id="cb36-8"></span>
<span id="cb36-9">compare_mod <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>(lr_metrics, rf_metrics)</span></code></pre></div></div>
</details>
</div>
<p>First, let’s take a look at the average F1-Score for each model:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb37" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb37-1">compare_mod <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb37-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(.metric <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"f_meas"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb37-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(model) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb37-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summarize</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">avg_f_meas =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(mean)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb37-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">model =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fct_reorder</span>(model, avg_f_meas)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb37-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(model, avg_f_meas, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> model)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb37-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_col</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb37-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_flip</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb37-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_fill_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Blues"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb37-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_text</span>(</span>
<span id="cb37-11">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>,</span>
<span id="cb37-12">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">label =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">round_half_up</span>(avg_f_meas, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> avg_f_meas <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> .<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>)</span>
<span id="cb37-13">  )</span></code></pre></div></div>
</details>
</div>
<p>Not much of a difference here. So, we may also want to check out the average ROC AUC for each model:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb38" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb38-1">compare_mod <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb38-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(.metric <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"roc_auc"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb38-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(model) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb38-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summarize</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">avg_roc =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(mean)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb38-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">model =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fct_reorder</span>(model, avg_roc)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb38-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(model, avg_roc, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> model)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb38-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_col</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb38-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_flip</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb38-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_fill_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Blues"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb38-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_text</span>(</span>
<span id="cb38-11">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>,</span>
<span id="cb38-12">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">label =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">round_half_up</span>(avg_roc, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> avg_roc <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> .<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>)</span>
<span id="cb38-13">  )</span></code></pre></div></div>
</details>
</div>
<p>Looks like our random forest model did a bit better here, but still pretty close. Let’s plot the validation set ROC curves for the top penalized logistic regression model and random forest model:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb39" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb39-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>(rf_auc, lr_auc) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb39-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> specificity, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> sensitivity, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">col =</span> model)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb39-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_path</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">lwd =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.5</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">alpha =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.8</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb39-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_abline</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">lty =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb39-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_equal</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb39-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_viridis_d</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">option =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"plasma"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end =</span> .<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>)</span></code></pre></div></div>
</details>
</div>
<p>Overall, the model results are pretty similar, but the random forest model did seem to perform better than the logistic regression model. In this case, I highlighted the ROC AUC and F1-Score performance metrics, but the “best” performance metric will always depend on the question you are trying to answer with your model. For example, in some cases, you might be much more concerned about false negatives than false positives (e.g., when predicting severe storms). In other situations, you might only be concerned about each of these to the extent that they influence a model’s precision (e.g., when predicting profitable stocks).</p>
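<p>As a quick illustration of how false positives and false negatives feed into these metrics, here is a small base R sketch. The counts are made up for the example and are not taken from the crash data.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical confusion matrix counts (not from the crash data)
tp &lt;- 40  # true positives
fp &lt;- 10  # false positives
fn &lt;- 20  # false negatives

# Precision is penalized by false positives
precision &lt;- tp / (tp + fp)

# Recall is penalized by false negatives
recall &lt;- tp / (tp + fn)

# The F1-Score is the harmonic mean of precision and recall
f1 &lt;- 2 * precision * recall / (precision + recall)

round(c(precision = precision, recall = recall, f1 = f1), 3)</code></pre></div>
</div>
<p>With these counts, precision is 0.8, recall is about 0.667, and the F1-Score is about 0.727, so the F1-Score sits between the two and is pulled toward the weaker one.</p>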
<p>To keep things simple, let’s stick with the ROC AUC metric in this case. AUC stands for area under the curve. What curve, you may ask? The ROC curve, specifically. The ROC curve plots the tradeoff between the true positive rate (sensitivity) and the false positive rate (1 - specificity) across classification thresholds. Ideally, we want to maximize the true positive rate and minimize the false positive rate.</p>
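<p>To see where the curve and its area come from, here is a minimal base R sketch that computes both by hand. The <code>truth</code> and <code>prob</code> vectors are toy values invented for the illustration; in practice, <code>roc_curve()</code> and <code>roc_auc()</code> do this work for us.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Toy data: true classes (1 = event) and predicted event probabilities
truth &lt;- c(1, 1, 0, 1, 0, 0, 1, 0)
prob &lt;- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1)

# Sweep thresholds from strict to lenient; Inf gives the (0, 0) corner
thresholds &lt;- c(Inf, sort(unique(prob), decreasing = TRUE))

# True positive rate (sensitivity) at each threshold
tpr &lt;- sapply(thresholds, function(t) {
  sum(prob &gt;= t &amp; truth == 1) / sum(truth == 1)
})

# False positive rate (1 - specificity) at each threshold
fpr &lt;- sapply(thresholds, function(t) {
  sum(prob &gt;= t &amp; truth == 0) / sum(truth == 0)
})

# Area under the curve using the trapezoidal rule
auc &lt;- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
auc</code></pre></div>
</div>
<p>For this toy example, the area works out to 0.75. A model that ranks every event above every non-event would score 1, and random guessing would land around 0.5.</p>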
<p>Let’s find the maximum mean ROC AUC:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb40" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb40-1">compare_mod <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb40-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(.metric <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"roc_auc"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb40-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(model) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb40-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summarize</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">avg_roc_auc =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(mean)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb40-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">slice_max</span>(avg_roc_auc)</span></code></pre></div></div>
</details>
</div>
<p>Now, it’s time to fit the best model one last time to the full training set. Then, we can evaluate the resulting final model on the test set.</p>
</section>
<section id="last-fit" class="level2">
<h2 class="anchored" data-anchor-id="last-fit">Last fit</h2>
<p>Recall that our goal was to predict whether a traffic crash would result in an injury or a vehicle being towed based on a priori situational factors. Given the results, we determined the random forest model performed better than the penalized logistic regression model. We also learned the best model hyperparameters from the <code>rf_best</code> object we created earlier. Now, we just need to fit the final model on all the rows of data not originally held out for testing (i.e., the training and validation sets) and evaluate the model performance one more time with the test set.</p>
<p>The <a href="https://www.tidymodels.org/start/tuning/">tune</a> package contains the function <code>last_fit()</code>, which fits a model to the whole training data and evaluates it on the test set. We just need to provide the workflow object of the best model and data split object (<em>not</em> the training data).</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb41" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb41-1">last_rf_mod <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand_forest</span>(</span>
<span id="cb41-2">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mtry =</span> rf_best<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>mtry,</span>
<span id="cb41-3">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">min_n =</span> rf_best<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>min_n,</span>
<span id="cb41-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">trees =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span></span>
<span id="cb41-5">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb41-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set_engine</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ranger"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">importance =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"impurity"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb41-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set_mode</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"classification"</span>)</span>
<span id="cb41-8"></span>
<span id="cb41-9">last_rf_workflow <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> rf_workflow <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb41-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">update_model</span>(last_rf_mod)</span>
<span id="cb41-11"></span>
<span id="cb41-12">last_rf_fit <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> last_rf_workflow <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb41-13">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">last_fit</span>(crash_split)</span></code></pre></div></div>
</details>
</div>
<p>And these are the final performance metrics:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb42" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb42-1">last_rf_fit <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb42-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_metrics</span>()</span></code></pre></div></div>
</details>
</div>
<p>Remember, if a model fit to the training data set also fits the test data set well, we can be reasonably confident that minimal overfitting has taken place.</p>
<p>To learn more about the model, we can look at the variable importance scores in the <code>.workflow</code> column. We pluck the first element from the column, and pull out the fit from the workflow object. Then, we can use the <a href="https://github.com/koalaverse/vip/">vip</a> package to visualize the variable importance scores for the top features.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb43" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb43-1">last_rf_fit <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb43-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pluck</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".workflow"</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb43-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">extract_fit_parsnip</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb43-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vip</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">num_features =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span></code></pre></div></div>
</details>
</div>
<p>By far, the most important factor in whether a crash results in injuries or the vehicle being towed is if the first collision in the crash involved a pedestrian or not.</p>
<p>Let’s take a quick look at the confusion matrix:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb44" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb44-1">last_rf_fit <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb44-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_predictions</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb44-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">conf_mat</span>(crash_type, .pred_class) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb44-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autoplot</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"heatmap"</span>)</span></code></pre></div></div>
</details>
</div>
<p>And, let’s create the final ROC curve:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb45" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb45-1">last_rf_fit <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb45-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect_predictions</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb45-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">roc_curve</span>(crash_type, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.pred_injury and / or tow due to crash</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb45-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autoplot</span>()</span></code></pre></div></div>
</details>
</div>
<p>The performance statistics from the validation set and the test set are very close, so we can be reasonably confident the random forest model with the selected features and hyperparameters would perform well when predicting new data.</p>
<blockquote class="blockquote">
<p>Special thanks to <a href="https://www.linkedin.com/in/atriplett">Drew Triplett</a> for his helpful comments on an earlier draft of this post!</p>
</blockquote>


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{2021,
  author = {},
  title = {Predicting Pileups: {Using} {ML} to Predict {Chicago} Crash
    Types},
  date = {2021-12-18},
  url = {https://www.jrwinget.com/blog/2021-12-18_predicting-pileups/},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-2021" class="csl-entry quarto-appendix-citeas">
<span>“Predicting Pileups: Using ML to Predict Chicago Crash
Types.”</span> 2021. December 18, 2021. <a href="https://www.jrwinget.com/blog/2021-12-18_predicting-pileups/">https://www.jrwinget.com/blog/2021-12-18_predicting-pileups/</a>.
</div></div></section></div> ]]></description>
  <category>Data Science</category>
  <guid>https://www.jrwinget.com/blog/2021-12-18_predicting-pileups/</guid>
  <pubDate>Sat, 18 Dec 2021 00:00:00 GMT</pubDate>
  <media:content url="https://www.jrwinget.com/blog/2021-12-18_predicting-pileups/featured.png" medium="image" type="image/png" height="72" width="144"/>
</item>
<item>
  <title>How to create a Twitter bot using R</title>
  <link>https://www.jrwinget.com/blog/2019-05-08_create-a-twitter-bot/</link>
  <description><![CDATA[ 




<div class="callout-alert">
<p>Note: Twitter’s (now X) API policies have changed significantly since this post was written. I’m leaving this post up for historical purposes and conceptual understanding, but be aware that the code and methods described here may no longer work with the current API.</p>
</div>
<p>Last week, I decided to kill an afternoon by creating a <a href="https://twitter.com/rathrgeneratr">Twitter bot</a>. Why? Mostly, I was procrastinating on revisions for a manuscript and looking for a small R project to practice my programming skills. Creating a Twitter bot seemed like a great option: Bots can follow other users, retweet content from others, or post original content, and all of this is controlled by one or more scripts.</p>
<p>This project is surprisingly easy: If you’re familiar with R (e.g., able to write a function), you shouldn’t have any trouble creating something like this. If you’re still learning how to write functions, this project is great practice!</p>
<section id="step-1-what-will-the-bot-tweet" class="level2">
<h2 class="anchored" data-anchor-id="step-1-what-will-the-bot-tweet">Step 1: What will the bot tweet?</h2>
<p>The conceptual part of this step was the toughest for me: What content do I want the bot to tweet? I could have done something practical like automatically tweet new blog content or retweet important information/news. But, I wanted to have fun with this. Others have made some pretty hilarious bots (e.g., <a href="https://twitter.com/WhyDoesR">WhyDoesR</a> or <a href="https://twitter.com/TwoHeadlines">TwoHeadlines</a>), so I wanted to create something simple that could get a few laughs. So, I decided to make a random “Would You Rather” generator that pits outrageous or terrible situations against one another.</p>
<p>To do this, I set up a little “database” containing a list of the situations in a .csv file (you can view that file on the <a href="https://gitlab.com/jrwinget/rathr-generatr">GitLab repo</a> for the bot). I also wanted to add pictures to the posts, so I found 11 open source images online and stored them in an “img” directory. Once the database and image directory are created, they need to be loaded into R.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(tidyverse)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(here)</span>
<span id="cb1-3"></span>
<span id="cb1-4">wyr <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">read_csv</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">here</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"wyr-db.csv"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">col_names =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb1-5">pictures <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list.files</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">here</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"img"</span>))</span></code></pre></div></div>
</details>
</div>
<p>Next, the bot needs to be able to randomly select two situations from the database and combine them into a sentence. A function would be perfect for this:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(glue)</span>
<span id="cb2-2"></span>
<span id="cb2-3">would_you_rather <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>() {</span>
<span id="cb2-4">  choices <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sample_n</span>(wyr, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-5">  a <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">slice</span>(choices, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb2-6">  b <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">slice</span>(choices, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-7">  sentence <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">glue</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Would you rather {a} or {b}?"</span>)</span>
<span id="cb2-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(sentence)</span>
<span id="cb2-9">}</span></code></pre></div></div>
</details>
</div>
<p>Now, the bot has a way of creating a sentence, but it still needs to actually generate one to tweet. It also needs to select a picture to tweet with the generated sentence. To create a sentence, we can use our new function and store the result in an object called “tweet”. To randomly select a picture, we just need to sample 1 of the 11 and store the name of the file as an object called “img”.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1">tweet <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">would_you_rather</span>()</span>
<span id="cb3-2">img <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sample</span>(pictures, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span></code></pre></div></div>
</details>
</div>
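<p>One thing to keep in mind: <code>list.files</code> returns bare file names by default, so any later path to the picture has to be rebuilt relative to the working directory. Here’s a minimal sketch of an alternative, assuming you would rather carry full paths around. The temporary directory and file names are made up for illustration and are not part of the bot.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical stand-in for the bot's "img" directory
img_dir &lt;- file.path(tempdir(), "img")
dir.create(img_dir, showWarnings = FALSE)
file.create(file.path(img_dir, c("cat.png", "dog.png", "notes.txt")))

# full.names = TRUE returns complete paths; pattern keeps only image files
pictures &lt;- list.files(img_dir, pattern = "\\.(png|jpe?g)$", full.names = TRUE)

# Randomly pick one picture and confirm the path points at a real file
img &lt;- sample(pictures, 1)
file.exists(img)</code></pre></div></div>
</details>
</div>
<p>With full paths, the selected file can be used directly, no matter which directory the script is started from.</p>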
</section>
<section id="step-2-connect-to-twitter" class="level2">
<h2 class="anchored" data-anchor-id="step-2-connect-to-twitter">Step 2: Connect to Twitter</h2>
<p>Because this bot will be tweeting from the R console, we have to register a new app with Twitter. Michael Kearney has a great <a href="https://rtweet.info/articles/auth.html">tutorial on this</a> using his <code>rtweet</code> package. Basically, load the <code>rtweet</code> package and connect to Twitter’s API using credentials stored in the environment. Once the credentials are stored, use the <code>get_tokens</code> function to fetch and load them.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(rtweet)</span>
<span id="cb4-2"></span>
<span id="cb4-3">token <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_tokens</span>()</span></code></pre></div></div>
</details>
</div>
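<p>If the bot is going to run unattended, it can be worth checking that the credentials are actually present before doing anything else. The sketch below is only an illustration: the variable names are hypothetical placeholders (not names that <code>rtweet</code> requires), so substitute whatever names you used when storing your keys.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical variable names; use the ones you actually stored
required_vars &lt;- c("TWITTER_API_KEY", "TWITTER_API_SECRET")

# Return the names of any variables that are unset or empty
missing_credentials &lt;- function(vars) {
  values &lt;- Sys.getenv(vars, unset = "")
  vars[values == ""]
}

missing &lt;- missing_credentials(required_vars)
if (length(missing) &gt; 0) {
  message("Missing credentials: ", paste(missing, collapse = ", "))
}</code></pre></div></div>
</details>
</div>
<p>In a scheduled script, you would probably swap <code>message</code> for <code>stop</code> so the run ends right away instead of failing later at the API call.</p>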
<p>Now, the bot can tweet a randomized “Would You Rather” situation with the <code>post_tweet</code> function. I decided to also include a hashtag in the tweet, which is really easy to do using the <code>glue</code> package. To add the randomly selected picture to the tweet, just include the file path to that picture in the <code>media</code> argument.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb5-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># tweet it</span></span>
<span id="cb5-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">post_tweet</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">status =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">glue</span>(</span>
<span id="cb5-3">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"{tweet}</span></span>
<span id="cb5-4"></span>
<span id="cb5-5"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">#wouldyourather"</span>),</span>
<span id="cb5-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">media =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">glue</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"img/{img}"</span>))</span></code></pre></div></div>
</details>
</div>
<p>I also wanted to collect all of the tweets the bot produces, so I made a log file to store them.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1">line <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">paste</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.character</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Sys.time</span>()), tweet, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">sep =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">" "</span>)</span>
<span id="cb6-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">write</span>(line, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">file =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">here</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"wyr-tweets.log"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">append =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span></code></pre></div></div>
</details>
</div>
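<p>Because each entry is a single line that starts with a timestamp, the log is easy to read back later (for example, to see how often a question repeats). Here’s a small sketch that uses a temporary file and made-up tweets in place of the real log, and assumes the timestamp is written as a date and a time separated by a space:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Temporary file standing in for "wyr-tweets.log"
log_file &lt;- tempfile(fileext = ".log")

# Made-up entries in the same "timestamp tweet" layout
entries &lt;- c(
  "2019-05-02 08:00:01 Would you rather A or B?",
  "2019-05-02 17:15:02 Would you rather C or D?",
  "2019-05-03 08:00:01 Would you rather A or B?"
)
for (entry in entries) write(entry, file = log_file, append = TRUE)

# Strip the leading date and time (the first two space-separated tokens)
logged &lt;- readLines(log_file)
tweets &lt;- sub("^[^ ]+ [^ ]+ ", "", logged)

# Count how often each question has been tweeted
table(tweets)</code></pre></div></div>
</details>
</div>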
</section>
<section id="step-3-automate-the-bot" class="level2">
<h2 class="anchored" data-anchor-id="step-3-automate-the-bot">Step 3: Automate the bot</h2>
<p>We now have a script that will randomly generate a “Would You Rather” situation along with a randomly chosen picture. However, it would be annoying to manually operate the bot every time we wanted it to tweet. Besides, doing so would undermine the entire point of making a “bot”. So let’s have the computer do this instead.</p>
<p>However, before the computer can understand the R script, we have to add a line of code to the top of the script (note ‘#!’ is important here):</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#! /usr/bin/env Rscript</span></span></code></pre></div></div>
</details>
</div>
<p>This basically turns our script into something the computer can execute. This is good, but having the script loaded on a server would be even better because the script can run whether or not your personal computer is on. Luckily, I happened to already have a server running, so I was able to simply load everything on there and schedule a cron job. Cron jobs basically tell the computer to run a certain command at a certain time (more on cron jobs <a href="https://help.ubuntu.com/community/CronHowto">here</a>).</p>
<p>If you’ve never scheduled a cron job before, it’s a relatively simple process (note: to use this method, you will need a Mac or Linux OS; for Windows OS, use Windows Task Scheduler). First, open the terminal and type:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode numberSource bash number-lines code-with-copy"><code class="sourceCode bash"><span id="cb8-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">crontab</span> <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">-e</span></span></code></pre></div></div>
</details>
</div>
<p>This opens your personal crontab (i.e., the configuration file). In every line, you can define one command to run and its schedule. The structure of the format is:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode numberSource bash number-lines code-with-copy"><code class="sourceCode bash"><span id="cb9-1"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">minute</span> hour day-of-month month day-of-week command</span></code></pre></div></div>
</details>
</div>
<p>Using an asterisk as a value represents “any”. For example, to run a command every Monday at 8am, the format would be:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode numberSource bash number-lines code-with-copy"><code class="sourceCode bash"><span id="cb10-1"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">0</span> 8 <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">*</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">*</span> 1 /path/to/command</span></code></pre></div></div>
</details>
</div>
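<p>To make the field order concrete, here’s a small R helper that assembles a crontab line from named fields. The name <code>cron_line</code> is made up for this illustration, and the helper does no validation; it only shows which value goes where.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical helper: build one crontab line from its five schedule fields
# Any field left at its default of "*" means "any"
cron_line &lt;- function(command, minute = "*", hour = "*",
                      day_of_month = "*", month = "*", day_of_week = "*") {
  paste(minute, hour, day_of_month, month, day_of_week, command)
}

# Every Monday at 8am
cron_line("/path/to/command", minute = 0, hour = 8, day_of_week = 1)</code></pre></div></div>
</details>
</div>
<p>The call above returns the same line as the Monday example: <code>0 8 * * 1 /path/to/command</code>.</p>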
<p>For this project, the command will tell the computer to execute the bot script we wrote in R. I chose to combine all of the files for the bot (e.g., bot script, database, pictures, etc.) into an R project on the server, so the command I created changes the working directory to the project directory (the <code>cd</code> command) and then runs the script (the <code>Rscript</code> command). I also chose to run the bot twice a day:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode numberSource bash number-lines code-with-copy"><code class="sourceCode bash"><span id="cb11-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># m h  dom mon dow    command</span></span>
<span id="cb11-2"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">0</span> 8 <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">*</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">*</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">*</span> cd ~/2019-05-02_would-you-rather-bot<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">;</span> <span class="ex" style="color: null;
background-color: null;
font-style: inherit;">Rscript</span> bot-script.R      <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># run at 8am CST</span></span>
<span id="cb11-3"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">15</span> 17 <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">*</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">*</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">*</span> cd ~/2019-05-02_would-you-rather-bot<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">;</span> <span class="ex" style="color: null;
background-color: null;
font-style: inherit;">Rscript</span> bot-script.R    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># run at 5:15pm CST</span></span></code></pre></div></div>
</details>
</div>
<p>You certainly don’t <em>have</em> to run the bot on a server; servers just make things easier and more consistent in this case. If you don’t have access to a server, the process will be basically the same for running the bot on a personal computer. You’ll just need to make sure the computer is on (and awake!) at the scheduled time(s) for it to automatically tweet. Or, you can <a href="https://dunia-it.com/wake-your-linux-up-from-sleep-for-a-cron-job/">wake your linux up from sleep for a cron job</a>.</p>
</section>
<section id="wrapping-up" class="level2">
<h2 class="anchored" data-anchor-id="wrapping-up">Wrapping up</h2>
<p>And, that’s it! Here’s the completed script:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb12-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#!/usr/bin/env Rscript</span></span>
<span id="cb12-2"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># would_you_rather_bot 0.1</span></span>
<span id="cb12-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># author: jeremy r. winget</span></span>
<span id="cb12-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(tidyverse)</span>
<span id="cb12-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(rtweet)</span>
<span id="cb12-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(here)</span>
<span id="cb12-7"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(glue)</span>
<span id="cb12-8"></span>
<span id="cb12-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># authenticate</span></span>
<span id="cb12-10">token <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_tokens</span>()</span>
<span id="cb12-11"></span>
<span id="cb12-12"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># read in data</span></span>
<span id="cb12-13">wyr <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">read_csv</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">here</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"wyr-db.csv"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">col_names =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb12-14">pictures <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list.files</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">here</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"img"</span>))</span>
<span id="cb12-15"></span>
<span id="cb12-16"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># function to generate sentence</span></span>
<span id="cb12-17">would_you_rather <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>() {</span>
<span id="cb12-18">  choices <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sample_n</span>(wyr, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb12-19">  a <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">slice</span>(choices, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb12-20">  b <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">slice</span>(choices, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb12-21">  sentence <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">glue</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Would you rather {a} or {b}?"</span>)</span>
<span id="cb12-22">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(sentence)</span>
<span id="cb12-23">}</span>
<span id="cb12-24"></span>
<span id="cb12-25"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># generate question and picture</span></span>
<span id="cb12-26">tweet <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">would_you_rather</span>()</span>
<span id="cb12-27">img <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sample</span>(pictures, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb12-28"></span>
<span id="cb12-29"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># tweet it</span></span>
<span id="cb12-30"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">post_tweet</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">status =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">glue</span>(</span>
<span id="cb12-31">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"{tweet}</span></span>
<span id="cb12-32"></span>
<span id="cb12-33"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">#wouldyourather"</span>),</span>
<span id="cb12-34">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">media =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">glue</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"img/{img}"</span>))</span>
<span id="cb12-35"></span>
<span id="cb12-36"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># create log entry</span></span>
<span id="cb12-37">line <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">paste</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.character</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Sys.time</span>()), tweet, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">sep =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">" "</span>)</span>
<span id="cb12-38"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">write</span>(line, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">file =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">here</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"wyr-tweets.log"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">append =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span></code></pre></div></div>
</details>
</div>
<p>If you create your own Twitter bot with R (or if this tutorial inspired any other projects), please <a href="https://twitter.com/_jwinget">share it with me</a>. I’d love to hear what you did!</p>
</section>
<section id="potential-future-features" class="level2">
<h2 class="anchored" data-anchor-id="potential-future-features">Potential future features</h2>
<p>Right now, this bot is pretty basic. I’ve had a few ideas for additional features/adjustments I may or may not end up incorporating:</p>
<ol type="1">
<li>I’d like to add more content to the database. I basically just googled “Would You Rather” questions and chose some of the more outrageous ones. But the database doesn’t list many different situations (and some are pretty lame), which can sometimes lead to repeated situations being tweeted. If anyone has any situations they’d like to add, feel free to submit a merge request!</li>
<li>If people like the bot and start engaging with it, I thought it’d be fun to retweet popular comments/answers. I’m not sure what that procedure would be yet, though.</li>
<li>It might be useful to create different categories of “Would You Rather” situations (e.g., outrageous situations, relationship situations, entertainment situations, etc.). If the situations aren’t chosen from the same category, it can lead to dumb questions (like <a href="https://twitter.com/rathrgeneratr/status/1125162015293673473">this one</a>).</li>
</ol>
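<p>As a rough illustration of the category idea in item 3, here is a minimal sketch. It assumes a hypothetical <code>situations</code> table with <code>category</code> and <code>situation</code> columns; the bot’s actual database layout may differ, so treat the names and example rows as placeholders.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)

# Hypothetical data: column names and rows are placeholders, not the bot's real database
situations &lt;- tibble::tibble(
  category  = c("outrageous", "outrageous", "relationship", "relationship"),
  situation = c("fight one horse-sized duck", "fight 100 duck-sized horses",
                "never argue again", "win every argument")
)

# Pick one category at random, then draw two distinct situations from it
chosen &lt;- sample(unique(situations$category), 1)
pair &lt;- situations %&gt;%
  filter(category == chosen) %&gt;%
  slice_sample(n = 2)

# Both options now come from the same category
tweet &lt;- paste0("Would you rather ", pair$situation[1], " or ", pair$situation[2], "?")</code></pre></div>
<p>Sampling the category first and the situations second is one simple way to keep both halves of a question comparable; weighting categories or tracking recently used situations would be natural extensions.</p>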


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{2019,
  author = {},
  title = {How to Create a {Twitter} Bot Using {R}},
  date = {2019-05-08},
  url = {https://www.jrwinget.com/blog/2019-05-08_create-a-twitter-bot/},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-2019" class="csl-entry quarto-appendix-citeas">
<span>“How to Create a Twitter Bot Using R.”</span> 2019. May 8, 2019.
<a href="https://www.jrwinget.com/blog/2019-05-08_create-a-twitter-bot/">https://www.jrwinget.com/blog/2019-05-08_create-a-twitter-bot/</a>.
</div></div></section></div> ]]></description>
  <category>Software Development</category>
  <guid>https://www.jrwinget.com/blog/2019-05-08_create-a-twitter-bot/</guid>
  <pubDate>Wed, 08 May 2019 00:00:00 GMT</pubDate>
  <media:content url="https://www.jrwinget.com/blog/2019-05-08_create-a-twitter-bot/featured.png" medium="image" type="image/png" height="103" width="144"/>
</item>
<item>
  <title>EDA: Chicago red light camera violations</title>
  <link>https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/</link>
  <description><![CDATA[ 




<section id="chicago-red-light-camera-violations" class="level1">
<h1>Chicago red light camera violations</h1>
<p>In this post, I walk through a simple exploratory data analysis of red light camera violations in Chicago.</p>
<section id="data-import" class="level2">
<h2 class="anchored" data-anchor-id="data-import">Data import</h2>
<p>Data downloaded from the <a href="https://data.cityofchicago.org/Transportation/Red-Light-Camera-Violations/spqx-js37">Chicago Data Portal</a>.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(tidyverse)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(modelr)</span>
<span id="cb1-3">(red_light_raw <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">read_csv</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2019-04-28_chi-red-light.csv"</span>))</span>
<span id="cb1-4"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # A tibble: 466,107 × 10</span></span>
<span id="cb1-5"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##    INTERSECTION   `CAMERA ID` ADDRESS `VIOLATION DATE` VIOLATIONS `X COORDINATE`</span></span>
<span id="cb1-6"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##    &lt;chr&gt;                &lt;dbl&gt; &lt;chr&gt;   &lt;chr&gt;                 &lt;dbl&gt;          &lt;dbl&gt;</span></span>
<span id="cb1-7"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  1 IRVING PARK A…        2763 4700 W… 04/09/2015                4             NA</span></span>
<span id="cb1-8"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  2 VAN BUREN AND…        2054 2400 W… 04/14/2015                5             NA</span></span>
<span id="cb1-9"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  3 115TH AND HAL…        2552 11500 … 04/08/2015                5             NA</span></span>
<span id="cb1-10"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  4 IRVING PARK A…        2764 4700 W… 04/19/2015                4             NA</span></span>
<span id="cb1-11"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  5 ELSTON AND IR…        1503 3700 W… 04/23/2015                3             NA</span></span>
<span id="cb1-12"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  6 31ST AND CALI…        2064 2800 W… 09/14/2014                3             NA</span></span>
<span id="cb1-13"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  7 31ST AND CALI…        2064 2800 W… 12/16/2014                1             NA</span></span>
<span id="cb1-14"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  8 31ST AND CALI…        2064 2800 W… 01/30/2015                4             NA</span></span>
<span id="cb1-15"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  9 115TH AND HAL…        2552 11500 … 03/28/2015               14             NA</span></span>
<span id="cb1-16"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 10 WENTWORTH AND…        2261 5500 S… 04/06/2015               11             NA</span></span>
<span id="cb1-17"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # ℹ 466,097 more rows</span></span>
<span id="cb1-18"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # ℹ 4 more variables: `Y COORDINATE` &lt;dbl&gt;, LATITUDE &lt;dbl&gt;, LONGITUDE &lt;dbl&gt;,</span></span>
<span id="cb1-19"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## #   LOCATION &lt;chr&gt;</span></span></code></pre></div></div>
</details>
</div>
<p>These data reflect the daily volume of violations created by the City of Chicago Red Light Program for each camera since July 1, 2014.</p>
<ul>
<li><strong>INTERSECTION</strong> = Intersection of the location of the red light enforcement camera(s). There may be more than one camera at each intersection</li>
<li><strong>CAMERA ID</strong> = A unique ID for each physical camera at an intersection, which may contain more than one camera</li>
<li><strong>ADDRESS</strong> = The address of the physical camera (CAMERA ID). The address may be the same for all cameras or different, based on the physical installation of each camera</li>
<li><strong>VIOLATION DATE</strong> = The date of when the violations occurred. NOTE: The citation may be issued on a different date</li>
<li><strong>VIOLATIONS</strong> = Number of violations for each camera on a particular day</li>
<li><strong>X COORDINATE</strong> = The X Coordinate, measured in feet, of the location of the camera. Geocoded using Illinois State Plane East</li>
<li><strong>Y COORDINATE</strong> = The Y Coordinate, measured in feet, of the location of the camera. Geocoded using Illinois State Plane East</li>
<li><strong>LATITUDE</strong> = The latitude of the physical location of the camera(s) based on the ADDRESS column. Geocoded using the WGS84</li>
<li><strong>LONGITUDE</strong> = The longitude of the physical location of the camera(s) based on the ADDRESS column. Geocoded using the WGS84</li>
<li><strong>LOCATION</strong> = The coordinates of the camera(s) based on the LATITUDE and LONGITUDE columns. Geocoded using the WGS84</li>
</ul>
</section>
<section id="data-clean" class="level2">
<h2 class="anchored" data-anchor-id="data-clean">Data clean</h2>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb2-1">(red_light <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> red_light_raw <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-2">  janitor<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">clean_names</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">separate</span>(violation_date, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"month"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"day"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"year"</span>), <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate_at</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vars</span>(month, day, year),</span>
<span id="cb2-5">            <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.numeric</span>(.)))</span>
<span id="cb2-6"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # A tibble: 466,107 × 12</span></span>
<span id="cb2-7"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##    intersection      camera_id address month   day  year violations x_coordinate</span></span>
<span id="cb2-8"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##    &lt;chr&gt;                 &lt;dbl&gt; &lt;chr&gt;   &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;      &lt;dbl&gt;        &lt;dbl&gt;</span></span>
<span id="cb2-9"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  1 IRVING PARK AND …      2763 4700 W…     4     9  2015          4           NA</span></span>
<span id="cb2-10"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  2 VAN BUREN AND WE…      2054 2400 W…     4    14  2015          5           NA</span></span>
<span id="cb2-11"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  3 115TH AND HALSTED      2552 11500 …     4     8  2015          5           NA</span></span>
<span id="cb2-12"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  4 IRVING PARK AND …      2764 4700 W…     4    19  2015          4           NA</span></span>
<span id="cb2-13"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  5 ELSTON AND IRVIN…      1503 3700 W…     4    23  2015          3           NA</span></span>
<span id="cb2-14"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  6 31ST AND CALIFOR…      2064 2800 W…     9    14  2014          3           NA</span></span>
<span id="cb2-15"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  7 31ST AND CALIFOR…      2064 2800 W…    12    16  2014          1           NA</span></span>
<span id="cb2-16"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  8 31ST AND CALIFOR…      2064 2800 W…     1    30  2015          4           NA</span></span>
<span id="cb2-17"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  9 115TH AND HALSTED      2552 11500 …     3    28  2015         14           NA</span></span>
<span id="cb2-18"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 10 WENTWORTH AND GA…      2261 5500 S…     4     6  2015         11           NA</span></span>
<span id="cb2-19"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # ℹ 466,097 more rows</span></span>
<span id="cb2-20"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # ℹ 4 more variables: y_coordinate &lt;dbl&gt;, latitude &lt;dbl&gt;, longitude &lt;dbl&gt;,</span></span>
<span id="cb2-21"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## #   location &lt;chr&gt;</span></span></code></pre></div></div>
</details>
</div>
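<p>The cleaning step above splits <code>violation_date</code> into numeric month, day, and year columns. If a true date column is preferred, one alternative (not used in this post) is <code>lubridate::mdy()</code>. The sketch below assumes the lubridate package is installed and uses a few date strings copied from the raw data as stand-ins for the full column.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(lubridate)

# Sample values copied from the VIOLATION DATE column shown earlier
violation_date &lt;- c("04/09/2015", "09/14/2014", "12/16/2014")

# mdy() parses month/day/year strings into Date objects
parsed &lt;- mdy(violation_date)

# Components can still be pulled out when needed
year(parsed)
month(parsed)</code></pre></div>
<p>A Date column sorts and plots chronologically on its own, at the cost of needing <code>year()</code> and <code>month()</code> calls wherever the separate pieces are wanted.</p>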
</section>
<section id="simple-eda" class="level2">
<h2 class="anchored" data-anchor-id="simple-eda">Simple EDA</h2>
<p>I’m going to look at the number of red light violations by intersection across time.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(red_light, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(year <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> month <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>, violations)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb3-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">group =</span> intersection))</span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/index_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Turns out, this graph isn’t very useful. It’s hard to get an idea of what’s really going on here because there is so much lumped at the bottom. For now, I’m going to focus on the more popular intersections (i.e., those with more violations).</p>
</section>
<section id="focus-on-popular-intersections" class="level2">
<h2 class="anchored" data-anchor-id="focus-on-popular-intersections">Focus on popular intersections</h2>
<p>These results might be a bit misleading, since popular intersections may be fundamentally different (e.g., more dangerous, more lucrative). But it will at least be a good place to start. Since there are 183 intersections in this dataset, I’m going to arbitrarily select all intersections that average more than 5.5 red light violations per camera per day.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1">(intersections <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> red_light <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(intersection) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summarize</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">avg =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(violations)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrange</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">desc</span>(avg)))</span>
<span id="cb4-5"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # A tibble: 183 × 2</span></span>
<span id="cb4-6"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##    intersection                avg</span></span>
<span id="cb4-7"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##    &lt;chr&gt;                     &lt;dbl&gt;</span></span>
<span id="cb4-8"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  1 CICERO AND I55             33.8</span></span>
<span id="cb4-9"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  2 LAKE AND UPPER WACKER      31.4</span></span>
<span id="cb4-10"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  3 LAKE SHORE DR AND BELMONT  26.9</span></span>
<span id="cb4-11"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  4 VAN BUREN AND WESTERN      20.8</span></span>
<span id="cb4-12"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  5 LAFAYETTE AND 87TH         18.7</span></span>
<span id="cb4-13"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  6 STATE AND 79TH             16.6</span></span>
<span id="cb4-14"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  7 STONEY ISLAND AND 76TH     15.8</span></span>
<span id="cb4-15"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  8 LINCOLN AND MCCORMICK      15.2</span></span>
<span id="cb4-16"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  9 WENTWORTH AND GARFIELD     14.8</span></span>
<span id="cb4-17"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 10 ARCHER AND CICERO          14.4</span></span>
<span id="cb4-18"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # ℹ 173 more rows</span></span>
<span id="cb4-19"></span>
<span id="cb4-20">(red_light_popular <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> red_light <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-21">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">semi_join</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(intersections, avg <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5.5</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-22">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">date =</span> year <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> (month <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>))</span>
<span id="cb4-23"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Joining with `by = join_by(intersection)`</span></span>
<span id="cb4-24"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # A tibble: 135,780 × 13</span></span>
<span id="cb4-25"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##    intersection      camera_id address month   day  year violations x_coordinate</span></span>
<span id="cb4-26"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##    &lt;chr&gt;                 &lt;dbl&gt; &lt;chr&gt;   &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;      &lt;dbl&gt;        &lt;dbl&gt;</span></span>
<span id="cb4-27"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  1 VAN BUREN AND WE…      2054 2400 W…     4    14  2015          5           NA</span></span>
<span id="cb4-28"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  2 115TH AND HALSTED      2552 11500 …     4     8  2015          5           NA</span></span>
<span id="cb4-29"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  3 115TH AND HALSTED      2552 11500 …     3    28  2015         14           NA</span></span>
<span id="cb4-30"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  4 WENTWORTH AND GA…      2261 5500 S…     4     6  2015         11           NA</span></span>
<span id="cb4-31"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  5 115TH AND HALSTED      2552 11500 …     9     1  2014         14           NA</span></span>
<span id="cb4-32"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  6 115TH AND HALSTED      2552 11500 …    10    19  2014         18           NA</span></span>
<span id="cb4-33"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  7 115TH AND HALSTED      2552 11500 …     7    12  2014         31           NA</span></span>
<span id="cb4-34"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  8 31ST ST AND MART…      2121 3100 S…     7     7  2014         21           NA</span></span>
<span id="cb4-35"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  9 115TH AND HALSTED      2552 11500 …    11    18  2014          9           NA</span></span>
<span id="cb4-36"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 10 VAN BUREN AND WE…      2054 2400 W…     7    10  2014         11           NA</span></span>
<span id="cb4-37"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # ℹ 135,770 more rows</span></span>
<span id="cb4-38"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # ℹ 5 more variables: y_coordinate &lt;dbl&gt;, latitude &lt;dbl&gt;, longitude &lt;dbl&gt;,</span></span>
<span id="cb4-39"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## #   location &lt;chr&gt;, date &lt;dbl&gt;</span></span></code></pre></div></div>
</details>
</div>
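<p>The <code>date</code> column created above encodes each month as a fractional position within its year, so that January sits exactly on the year boundary. A small worked example with made-up year and month values (not rows from the dataset) shows the arithmetic:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Toy values chosen only to illustrate the encoding
year  &lt;- c(2014, 2014, 2015)
month &lt;- c(7, 12, 1)

# Subtracting 1 makes January map to the start of the year
date &lt;- year + (month - 1) / 12
# July 2014 maps to 2014.5, December 2014 to about 2014.917, January 2015 to 2015
date</code></pre></div>
<p>By comparison, the first plot used <code>year + month / 12</code>, which shifts every point one month to the right; the shape of the lines is the same, but the axis labels line up with calendar years only in the <code>(month - 1)</code> version.</p>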
<p>Now, I replot the initial graph.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb5-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(red_light_popular, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(date, violations)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb5-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">group =</span> intersection))</span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/index_files/figure-html/unnamed-chunk-6-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Decreasing the sample reduced the number of intersections that were lumped at the bottom, but there are still a lot of data there. Let’s try adding some transparency and a log10 transformation.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(red_light_popular, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(date, violations)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">group =</span> intersection), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">alpha =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_log10</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_smooth</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb6-5"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'</span></span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/index_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
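<p>To see what the log10 scale does to the y-axis, it can help to apply the transformation to a few made-up counts: each tenfold increase becomes one equal step.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Made-up violation counts spanning two orders of magnitude
counts &lt;- c(1, 10, 100)

# On the log10 scale, these are evenly spaced
log10(counts)
## [1] 0 1 2</code></pre></div>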
<p>The transformation does a fairly good job of shrinking the high-violation intersections down and scaling the low-violation intersections up. Now it looks like there might be a recurring pattern within each year, so I’m wondering if there is a seasonal trend. To look at this, let’s focus on a single intersection for the moment: Lake Shore Dr.&nbsp;and Belmont.</p>
</section>
<section id="whats-the-pattern" class="level2">
<h2 class="anchored" data-anchor-id="whats-the-pattern">What’s the pattern?</h2>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb7-1">lsd_belmont <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> red_light_popular <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(intersection <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"LAKE SHORE DR AND BELMONT"</span>)</span>
<span id="cb7-3"></span>
<span id="cb7-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(lsd_belmont, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(date, violations)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">group =</span> intersection)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_smooth</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_jitter</span>()</span>
<span id="cb7-8"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'</span></span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/index_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Now, let’s see how the monthly patterns change by year for the same intersection.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb8-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(lsd_belmont, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(month, violations)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">group =</span> year)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_jitter</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_smooth</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb8-5"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'</span></span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/index_files/figure-html/unnamed-chunk-9-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Both of these graphs support the seasonal trend idea: there is a bump in red light violations every summer, between May and August.</p>
<p>Now, I have a few questions:</p>
<ul>
<li>Are these patterns the same for all intersections?</li>
<li>What’s driving these peaks? More drivers/tourists on the road in summer months?</li>
<li>What’s producing the gap between the high violations (above 25 violations) and low violations (below 25 violations)?</li>
<li>Is this pattern more pronounced in certain parts of the city (e.g., Lake Shore and Belmont is by a popular highway)?</li>
<li>What happened in the summer/fall of 2016?</li>
</ul>
</section>
<section id="can-we-remove-this-pattern" class="level2">
<h2 class="anchored" data-anchor-id="can-we-remove-this-pattern">Can we remove this pattern?</h2>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb9-1">belmont_mod <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lm</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log</span>(violations) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">factor</span>(month), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> lsd_belmont)</span>
<span id="cb9-2"></span>
<span id="cb9-3">lsd_belmont <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">add_predictions</span>(belmont_mod) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(date, pred)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>()</span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/index_files/figure-html/unnamed-chunk-10-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>This shows the model has captured the seasonal pattern, but plotting the residuals will probably be more useful.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb10-1">lsd_belmont <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb10-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">add_residuals</span>(belmont_mod) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb10-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(date, resid)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_hline</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yintercept =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"white"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">linewidth =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_smooth</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb10-7"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'</span></span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/index_files/figure-html/unnamed-chunk-11-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>It looks like the seasonal model does a good job of explaining the data, especially from 2017 onward. And, by removing the strong monthly pattern, we can see the long-term trends much more clearly. There’s a steady increase from the beginning of the data to about 2016, when the number of red light violations peaks. Violations then dip slightly but remain relatively stable through the present.</p>
<p>More questions:</p>
<ul>
<li>What’s driving this trend?</li>
<li>What happened around 2016?</li>
<li>Is this pattern the same for all intersections?</li>
</ul>
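<p>As a rough illustration of why the residuals isolate the long-term trend, here is a minimal sketch on simulated data. The <code>toy</code> data frame, its seasonal bump, and its slow upward drift are all made up for the example; only the model formula mirrors <code>belmont_mod</code>.</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical monthly series: a seasonal bump plus a slow upward trend
set.seed(42)
toy &lt;- data.frame(
  year  = rep(2015:2018, each = 12),
  month = rep(1:12, times = 4)
)
toy$t &lt;- seq_len(nrow(toy))
toy$violations &lt;- exp(
  3 + 0.3 * sin((toy$month - 4) * pi / 6) + 0.01 * toy$t +
    rnorm(nrow(toy), sd = 0.05)
)

# Same model form as belmont_mod: one mean per calendar month, on the log scale
toy_mod &lt;- lm(log(violations) ~ factor(month), data = toy)
toy$resid &lt;- resid(toy_mod)

# Residuals average to zero within every month, so the seasonal signal is gone
month_means &lt;- tapply(toy$resid, toy$month, mean)

# What is left still tracks the slow trend
trend_cor &lt;- cor(toy$resid, toy$t)</code></pre></div>
</div>
<p>Because the model fits one mean per calendar month, anything that repeats every year is absorbed into the fitted values, and the residuals carry only what changes from year to year. Note that the residuals are on the natural-log scale, so <code>exp(resid)</code> is a multiplicative ratio: a residual of 0.1 means roughly 10% more violations than that month’s typical level.</p>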
</section>
<section id="all-intersections" class="level2">
<h2 class="anchored" data-anchor-id="all-intersections">All intersections</h2>
<p>Now, I want to extend this model to all of the intersections in the sample.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb11-1">by_intersection <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> red_light_popular <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(intersection) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">nest</span>()</span>
<span id="cb11-4"></span>
<span id="cb11-5">intersection_model <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(df) {</span>
<span id="cb11-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lm</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log10</span>(violations) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">factor</span>(month), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> df)</span>
<span id="cb11-7">}</span>
<span id="cb11-8"></span>
<span id="cb11-9">partioned <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> by_intersection <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(</span>
<span id="cb11-11">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">model =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map</span>(data, intersection_model),</span>
<span id="cb11-12">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">resids =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map2</span>(data, model, add_residuals)</span>
<span id="cb11-13">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-14">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unnest</span>(resids)</span>
<span id="cb11-15"></span>
<span id="cb11-16"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(partioned, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(date, resid)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-17">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">group =</span> intersection), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">alpha =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-18">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">stat_summary</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">geom =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"line"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fun =</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(x) <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>(x, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.25</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-19">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">stat_summary</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">geom =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"line"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fun =</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(x) <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>(x, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.75</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-20">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_smooth</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">linetype =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>)</span>
<span id="cb11-21"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'</span></span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/index_files/figure-html/unnamed-chunk-12-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>After removing the seasonal pattern, the long-term trend becomes much more stable. However, some unexplained patterns remain.</p>
<p>Further questions:</p>
<ul>
<li>What drove the increase in violations between 2014 and mid-2016?</li>
<li>What happened at the end of 2016/beginning of 2017?</li>
<li>Do violations occur systematically throughout the city, or are certain locations more likely to lead to higher violations?</li>
</ul>
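<p>One possible way to start on the last question is to compare how much of each intersection’s variation the month-only model explains. The sketch below uses simulated data with two invented intersections (<code>SEASONAL AND MAIN</code> and <code>FLAT AND MAIN</code> are not in the real data set) and base R in place of the nested-tibble workflow above, so treat it as an illustration of the idea rather than a result.</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical data: two made-up intersections, one seasonal and one not
set.seed(1)
toy &lt;- expand.grid(
  month = 1:12,
  year = 2015:2018,
  intersection = c("SEASONAL AND MAIN", "FLAT AND MAIN"),
  stringsAsFactors = FALSE
)
seasonal &lt;- toy$intersection == "SEASONAL AND MAIN"
toy$violations &lt;- exp(
  3 + ifelse(seasonal, 0.4 * sin((toy$month - 4) * pi / 6), 0) +
    rnorm(nrow(toy), sd = 0.1)
)

# Fit the same month-only model to each intersection separately
fits &lt;- lapply(split(toy, toy$intersection), function(df) {
  lm(log10(violations) ~ factor(month), data = df)
})

# R-squared: share of variance in log10(violations) explained by month
r2 &lt;- sapply(fits, function(m) summary(m)$r.squared)
sort(r2)</code></pre></div>
</div>
<p>Intersections with a low R-squared are the ones where the seasonal model fits poorly, which makes them natural candidates for a closer look at location-specific effects.</p>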


</section>
</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{2019,
  author = {},
  title = {EDA: {Chicago} Red Light Camera Violations},
  date = {2019-04-29},
  url = {https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-2019" class="csl-entry quarto-appendix-citeas">
<span>“EDA: Chicago Red Light Camera Violations.”</span> 2019. April 29,
2019. <a href="https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/">https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/</a>.
</div></div></section></div> ]]></description>
  <category>Data Science</category>
  <guid>https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/</guid>
  <pubDate>Mon, 29 Apr 2019 00:00:00 GMT</pubDate>
  <media:content url="https://www.jrwinget.com/blog/2019-04-29_chicago-red-light/featured.png" medium="image" type="image/png" height="95" width="144"/>
</item>
<item>
  <title>First TidyTuesday submission</title>
  <link>https://www.jrwinget.com/blog/2019-01-08_first-tidytuesday/</link>
  <description><![CDATA[ 




<p>It’s been quite some time since I’ve written here, so I thought I would use one of my <a href="https://twitter.com/_jwinget/status/1079437460357218304">2019 #rstats goals</a> as an excuse to brush off the dust.</p>
<p>In this post, I write about my <strong>first</strong> <a href="https://github.com/rfordatascience/tidytuesday/tree/master/data/2019/2019-01-08">#tidytuesday</a> submission, which uses the Economist’s “TV’s golden age is real” data set (original #tidytuesday code <a href="https://gitlab.com/jrwinget/tidy-tuesday/blob/master/2019-01-08_tv-golden-age.Rmd">here</a>). I also make a few improvements to some of the graphs and add tables with the <code>gt</code> package.</p>
<p>Special thanks to <a href="https://twitter.com/IsabellaGhement">Isabella Ghement</a> for providing <a href="https://twitter.com/IsabellaGhement/status/1082831866523086849">a few tips</a> on how to improve the original graphs!</p>
<p>First, load the required packages and data:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(tidyverse)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(lubridate)</span>
<span id="cb1-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggpmisc)</span>
<span id="cb1-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggrepel)</span>
<span id="cb1-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(gt)</span>
<span id="cb1-6">tv_rating <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">read_csv</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-01-08/IMDb_Economist_tv_ratings.csv"</span>)</span></code></pre></div></div>
</details>
</div>
<section id="which-years-had-the-highest-ratings" class="level3">
<h3 class="anchored" data-anchor-id="which-years-had-the-highest-ratings">Which years had the highest ratings?</h3>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb2-1">tv_rating <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">year =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">year</span>(date)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(year) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summarize</span>(</span>
<span id="cb2-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">n</span>(),</span>
<span id="cb2-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">avg_rating =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(av_rating)</span>
<span id="cb2-7">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(n <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">25</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrange</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">desc</span>(avg_rating)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb2-11">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(year, avg_rating) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb2-12">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb2-13">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_smooth</span>(</span>
<span id="cb2-14">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">formula =</span> y <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> x,</span>
<span id="cb2-15">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">method =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"lm"</span>,</span>
<span id="cb2-16">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>,</span>
<span id="cb2-17">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span></span>
<span id="cb2-18">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb2-19">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(</span>
<span id="cb2-20">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Which years had the highest ratings?"</span>,</span>
<span id="cb2-21">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Year"</span>,</span>
<span id="cb2-22">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Average rating"</span></span>
<span id="cb2-23">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb2-24">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">stat_poly_eq</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">label =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">paste</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"atop("</span>, ..eq.label.., <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">","</span>, ..adj.rr.label.., <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">")"</span>)),</span>
<span id="cb2-25">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">formula =</span> y <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> x, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">parse =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span></span>
<span id="cb2-26">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb2-27">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_light</span>()</span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.jrwinget.com/blog/2019-01-08_first-tidytuesday/index_files/figure-html/unnamed-chunk-2-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Looks like the newer the TV drama, the more likely it was to have a higher rating. Maybe some of this variance is due to shows with multiple seasons. Let’s see how this changes when looking at individual shows and their respective run lengths.</p>
</section>
<section id="which-titles-had-the-highest-ratings-and-how-long-did-they-run" class="level3">
<h3 class="anchored" data-anchor-id="which-titles-had-the-highest-ratings-and-how-long-did-they-run">Which titles had the highest ratings, and how long did they run?</h3>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1">titles <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> tv_rating <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb3-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(title) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb3-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summarize</span>(</span>
<span id="cb3-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">n</span>(),</span>
<span id="cb3-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">first_yr =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">min</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">year</span>(date)),</span>
<span id="cb3-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">last_yr =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">max</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">year</span>(date)),</span>
<span id="cb3-7">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">num_seasons =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">max</span>(seasonNumber),</span>
<span id="cb3-8">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yrs_aired =</span> last_yr <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> first_yr,</span>
<span id="cb3-9">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">avg_rating =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(av_rating)</span>
<span id="cb3-10">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb3-11">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(n <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span> <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># not enough cases to filter by 25 ratings per title</span></span>
<span id="cb3-12">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrange</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">desc</span>(avg_rating))</span>
<span id="cb3-13"></span>
<span id="cb3-14"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># highest rated titles' run lengths</span></span>
<span id="cb3-15">titles <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb3-16">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fct_reorder</span>(title, yrs_aired)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb3-17">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb3-18">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(title, yrs_aired, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> title) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb3-19">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_col</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb3-20">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_flip</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb3-21">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(</span>
<span id="cb3-22">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Most popular series' run lengths"</span>,</span>
<span id="cb3-23">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"TV Series Title"</span>,</span>
<span id="cb3-24">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Number of years aired"</span></span>
<span id="cb3-25">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb3-26">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_light</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb3-27">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.position =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"none"</span>)</span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.jrwinget.com/blog/2019-01-08_first-tidytuesday/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1"></span>
<span id="cb4-2"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># highest-rated titles' average ratings by years aired</span></span>
<span id="cb4-3">titles <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(yrs_aired, avg_rating) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_smooth</span>(</span>
<span id="cb4-8">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">formula =</span> y <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> x,</span>
<span id="cb4-9">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">method =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"lm"</span>,</span>
<span id="cb4-10">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>,</span>
<span id="cb4-11">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span></span>
<span id="cb4-12">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-13">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(</span>
<span id="cb4-14">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Most popular series' ratings over time"</span>,</span>
<span id="cb4-15">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Number of years ran"</span>,</span>
<span id="cb4-16">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Average rating"</span></span>
<span id="cb4-17">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-18">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">stat_poly_eq</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">label =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">paste</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"atop("</span>, after_stat(eq.label), <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">","</span>, after_stat(adj.rr.label), <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">")"</span>)),</span>
<span id="cb4-19">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">formula =</span> y <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> x, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">parse =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span></span>
<span id="cb4-20">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-21">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_light</span>()</span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.jrwinget.com/blog/2019-01-08_first-tidytuesday/index_files/figure-html/unnamed-chunk-3-2.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>I haven’t seen all of these shows, but for the most part their titles suggest suspenseful dramas (e.g., mystery, crime, maybe even thriller/horror). King of the Hill doesn’t fit that description, though, so let’s take a look at the <code>genres</code> variable. Because all of these shows are dramas, I’m treating the values of this variable as sub-genres of drama.</p>
</section>
<section id="which-sub-genres-are-most-popular" class="level3">
<h3 class="anchored" data-anchor-id="which-sub-genres-are-most-popular">Which sub-genres are most popular?</h3>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb5-1">top_ratings <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(df) {</span>
<span id="cb5-2">  df <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summarize</span>(</span>
<span id="cb5-4">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">n</span>(),</span>
<span id="cb5-5">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">avg_rating =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(av_rating),</span>
<span id="cb5-6">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">med_rating =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">median</span>(av_rating)</span>
<span id="cb5-7">    ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(n <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">25</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-9">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrange</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">desc</span>(med_rating))</span>
<span id="cb5-10">}</span>
<span id="cb5-11"></span>
<span id="cb5-12">tv_rating <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-13">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(genres) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-14">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">top_ratings</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-15">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">genres =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str_replace_all</span>(genres, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">","</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">", "</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-16">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gt</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-17">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tab_header</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Which drama sub-genres are most popular?"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-18">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fmt_number</span>(</span>
<span id="cb5-19">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">columns =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(avg_rating, med_rating),</span>
<span id="cb5-20">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">decimals =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span></span>
<span id="cb5-21">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-22">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cols_label</span>(</span>
<span id="cb5-23">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">genres =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Sub-genres"</span>,</span>
<span id="cb5-24">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Number of responses"</span>,</span>
<span id="cb5-25">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">avg_rating =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Average rating"</span>,</span>
<span id="cb5-26">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">med_rating =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Median rating"</span></span>
<span id="cb5-27">  )</span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div id="krrbjkttos" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
<style>#krrbjkttos table {
  font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
  -webkit-font-smoothing: antialiased;
  -moz-osx-font-smoothing: grayscale;
}

#krrbjkttos thead, #krrbjkttos tbody, #krrbjkttos tfoot, #krrbjkttos tr, #krrbjkttos td, #krrbjkttos th {
  border-style: none;
}

#krrbjkttos p {
  margin: 0;
  padding: 0;
}

#krrbjkttos .gt_table {
  display: table;
  border-collapse: collapse;
  line-height: normal;
  margin-left: auto;
  margin-right: auto;
  color: #333333;
  font-size: 16px;
  font-weight: normal;
  font-style: normal;
  background-color: #FFFFFF;
  width: auto;
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #A8A8A8;
  border-right-style: none;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #A8A8A8;
  border-left-style: none;
  border-left-width: 2px;
  border-left-color: #D3D3D3;
}

#krrbjkttos .gt_caption {
  padding-top: 4px;
  padding-bottom: 4px;
}

#krrbjkttos .gt_title {
  color: #333333;
  font-size: 125%;
  font-weight: initial;
  padding-top: 4px;
  padding-bottom: 4px;
  padding-left: 5px;
  padding-right: 5px;
  border-bottom-color: #FFFFFF;
  border-bottom-width: 0;
}

#krrbjkttos .gt_subtitle {
  color: #333333;
  font-size: 85%;
  font-weight: initial;
  padding-top: 3px;
  padding-bottom: 5px;
  padding-left: 5px;
  padding-right: 5px;
  border-top-color: #FFFFFF;
  border-top-width: 0;
}

#krrbjkttos .gt_heading {
  background-color: #FFFFFF;
  text-align: center;
  border-bottom-color: #FFFFFF;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
}

#krrbjkttos .gt_bottom_border {
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
}

#krrbjkttos .gt_col_headings {
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
}

#krrbjkttos .gt_col_heading {
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: normal;
  text-transform: inherit;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
  vertical-align: bottom;
  padding-top: 5px;
  padding-bottom: 6px;
  padding-left: 5px;
  padding-right: 5px;
  overflow-x: hidden;
}

#krrbjkttos .gt_column_spanner_outer {
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: normal;
  text-transform: inherit;
  padding-top: 0;
  padding-bottom: 0;
  padding-left: 4px;
  padding-right: 4px;
}

#krrbjkttos .gt_column_spanner_outer:first-child {
  padding-left: 0;
}

#krrbjkttos .gt_column_spanner_outer:last-child {
  padding-right: 0;
}

#krrbjkttos .gt_column_spanner {
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  vertical-align: bottom;
  padding-top: 5px;
  padding-bottom: 5px;
  overflow-x: hidden;
  display: inline-block;
  width: 100%;
}

#krrbjkttos .gt_spanner_row {
  border-bottom-style: hidden;
}

#krrbjkttos .gt_group_heading {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: initial;
  text-transform: inherit;
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
  vertical-align: middle;
  text-align: left;
}

#krrbjkttos .gt_empty_group_heading {
  padding: 0.5px;
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: initial;
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  vertical-align: middle;
}

#krrbjkttos .gt_from_md > :first-child {
  margin-top: 0;
}

#krrbjkttos .gt_from_md > :last-child {
  margin-bottom: 0;
}

#krrbjkttos .gt_row {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  margin: 10px;
  border-top-style: solid;
  border-top-width: 1px;
  border-top-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
  vertical-align: middle;
  overflow-x: hidden;
}

#krrbjkttos .gt_stub {
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: initial;
  text-transform: inherit;
  border-right-style: solid;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
  padding-left: 5px;
  padding-right: 5px;
}

#krrbjkttos .gt_stub_row_group {
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: initial;
  text-transform: inherit;
  border-right-style: solid;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
  padding-left: 5px;
  padding-right: 5px;
  vertical-align: top;
}

#krrbjkttos .gt_row_group_first td {
  border-top-width: 2px;
}

#krrbjkttos .gt_row_group_first th {
  border-top-width: 2px;
}

#krrbjkttos .gt_summary_row {
  color: #333333;
  background-color: #FFFFFF;
  text-transform: inherit;
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
}

#krrbjkttos .gt_first_summary_row {
  border-top-style: solid;
  border-top-color: #D3D3D3;
}

#krrbjkttos .gt_first_summary_row.thick {
  border-top-width: 2px;
}

#krrbjkttos .gt_last_summary_row {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
}

#krrbjkttos .gt_grand_summary_row {
  color: #333333;
  background-color: #FFFFFF;
  text-transform: inherit;
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
}

#krrbjkttos .gt_first_grand_summary_row {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  border-top-style: double;
  border-top-width: 6px;
  border-top-color: #D3D3D3;
}

#krrbjkttos .gt_last_grand_summary_row_top {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  border-bottom-style: double;
  border-bottom-width: 6px;
  border-bottom-color: #D3D3D3;
}

#krrbjkttos .gt_striped {
  background-color: rgba(128, 128, 128, 0.05);
}

#krrbjkttos .gt_table_body {
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
}

#krrbjkttos .gt_footnotes {
  color: #333333;
  background-color: #FFFFFF;
  border-bottom-style: none;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 2px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
}

#krrbjkttos .gt_footnote {
  margin: 0px;
  font-size: 90%;
  padding-top: 4px;
  padding-bottom: 4px;
  padding-left: 5px;
  padding-right: 5px;
}

#krrbjkttos .gt_sourcenotes {
  color: #333333;
  background-color: #FFFFFF;
  border-bottom-style: none;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 2px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
}

#krrbjkttos .gt_sourcenote {
  font-size: 90%;
  padding-top: 4px;
  padding-bottom: 4px;
  padding-left: 5px;
  padding-right: 5px;
}

#krrbjkttos .gt_left {
  text-align: left;
}

#krrbjkttos .gt_center {
  text-align: center;
}

#krrbjkttos .gt_right {
  text-align: right;
  font-variant-numeric: tabular-nums;
}

#krrbjkttos .gt_font_normal {
  font-weight: normal;
}

#krrbjkttos .gt_font_bold {
  font-weight: bold;
}

#krrbjkttos .gt_font_italic {
  font-style: italic;
}

#krrbjkttos .gt_super {
  font-size: 65%;
}

#krrbjkttos .gt_footnote_marks {
  font-size: 75%;
  vertical-align: 0.4em;
  position: initial;
}

#krrbjkttos .gt_asterisk {
  font-size: 100%;
  vertical-align: 0;
}

#krrbjkttos .gt_indent_1 {
  text-indent: 5px;
}

#krrbjkttos .gt_indent_2 {
  text-indent: 10px;
}

#krrbjkttos .gt_indent_3 {
  text-indent: 15px;
}

#krrbjkttos .gt_indent_4 {
  text-indent: 20px;
}

#krrbjkttos .gt_indent_5 {
  text-indent: 25px;
}

#krrbjkttos .katex-display {
  display: inline-flex !important;
  margin-bottom: 0.75em !important;
}

#krrbjkttos div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
  height: 0px !important;
}
</style>

<table class="gt_table caption-top table table-sm table-striped small" data-quarto-bootstrap="false">
<thead>
<tr class="gt_heading header">
<td colspan="4" class="gt_heading gt_title gt_font_normal gt_bottom_border">Which drama sub-genres are most popular?</td>
</tr>
<tr class="gt_col_headings even">
<th id="genres" class="gt_col_heading gt_columns_bottom_border gt_left" data-quarto-table-cell-role="th" scope="col">Sub-genres</th>
<th id="n" class="gt_col_heading gt_columns_bottom_border gt_right" data-quarto-table-cell-role="th" scope="col">Number of responses</th>
<th id="avg_rating" class="gt_col_heading gt_columns_bottom_border gt_right" data-quarto-table-cell-role="th" scope="col">Average rating</th>
<th id="med_rating" class="gt_col_heading gt_columns_bottom_border gt_right" data-quarto-table-cell-role="th" scope="col">Median rating</th>
</tr>
</thead>
<tbody class="gt_table_body">
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Drama, Fantasy, Horror</td>
<td class="gt_row gt_right" headers="n">56</td>
<td class="gt_row gt_right" headers="avg_rating">8.341</td>
<td class="gt_row gt_right" headers="med_rating">8.505</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Crime, Drama, Thriller</td>
<td class="gt_row gt_right" headers="n">63</td>
<td class="gt_row gt_right" headers="avg_rating">8.390</td>
<td class="gt_row gt_right" headers="med_rating">8.409</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Action, Crime, Drama</td>
<td class="gt_row gt_right" headers="n">146</td>
<td class="gt_row gt_right" headers="avg_rating">8.156</td>
<td class="gt_row gt_right" headers="med_rating">8.282</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Crime, Drama</td>
<td class="gt_row gt_right" headers="n">107</td>
<td class="gt_row gt_right" headers="avg_rating">8.267</td>
<td class="gt_row gt_right" headers="med_rating">8.268</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Drama, Thriller</td>
<td class="gt_row gt_right" headers="n">27</td>
<td class="gt_row gt_right" headers="avg_rating">8.028</td>
<td class="gt_row gt_right" headers="med_rating">8.192</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Drama, Fantasy, Mystery</td>
<td class="gt_row gt_right" headers="n">32</td>
<td class="gt_row gt_right" headers="avg_rating">8.143</td>
<td class="gt_row gt_right" headers="med_rating">8.162</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Drama</td>
<td class="gt_row gt_right" headers="n">168</td>
<td class="gt_row gt_right" headers="avg_rating">8.001</td>
<td class="gt_row gt_right" headers="med_rating">8.160</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Adventure, Drama, Fantasy</td>
<td class="gt_row gt_right" headers="n">27</td>
<td class="gt_row gt_right" headers="avg_rating">8.107</td>
<td class="gt_row gt_right" headers="med_rating">8.145</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Drama, Mystery, Sci-Fi</td>
<td class="gt_row gt_right" headers="n">58</td>
<td class="gt_row gt_right" headers="avg_rating">8.061</td>
<td class="gt_row gt_right" headers="med_rating">8.113</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Comedy, Drama, Family</td>
<td class="gt_row gt_right" headers="n">43</td>
<td class="gt_row gt_right" headers="avg_rating">8.008</td>
<td class="gt_row gt_right" headers="med_rating">8.110</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Comedy, Crime, Drama</td>
<td class="gt_row gt_right" headers="n">80</td>
<td class="gt_row gt_right" headers="avg_rating">8.022</td>
<td class="gt_row gt_right" headers="med_rating">8.094</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Comedy, Drama</td>
<td class="gt_row gt_right" headers="n">174</td>
<td class="gt_row gt_right" headers="avg_rating">8.021</td>
<td class="gt_row gt_right" headers="med_rating">8.087</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Crime, Drama, Mystery</td>
<td class="gt_row gt_right" headers="n">369</td>
<td class="gt_row gt_right" headers="avg_rating">7.991</td>
<td class="gt_row gt_right" headers="med_rating">8.049</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Action, Adventure, Drama</td>
<td class="gt_row gt_right" headers="n">112</td>
<td class="gt_row gt_right" headers="avg_rating">8.020</td>
<td class="gt_row gt_right" headers="med_rating">7.975</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Comedy, Drama, Romance</td>
<td class="gt_row gt_right" headers="n">76</td>
<td class="gt_row gt_right" headers="avg_rating">7.973</td>
<td class="gt_row gt_right" headers="med_rating">7.962</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Action, Drama, Sci-Fi</td>
<td class="gt_row gt_right" headers="n">28</td>
<td class="gt_row gt_right" headers="avg_rating">8.046</td>
<td class="gt_row gt_right" headers="med_rating">7.943</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Animation, Comedy, Drama</td>
<td class="gt_row gt_right" headers="n">28</td>
<td class="gt_row gt_right" headers="avg_rating">8.040</td>
<td class="gt_row gt_right" headers="med_rating">7.918</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Drama, Romance</td>
<td class="gt_row gt_right" headers="n">86</td>
<td class="gt_row gt_right" headers="avg_rating">7.834</td>
<td class="gt_row gt_right" headers="med_rating">7.876</td>
</tr>
</tbody>
</table>

</div>
</div>
</div>
<p>These results offer some support for my initial impression: People tend to give higher ratings to suspenseful dramas (e.g., crime, thriller, horror, mystery, action). But there’s not much variability between the values. This might be because each value in ‘genres’ bundles several sub-genres together. Combining genres like this could hide underlying patterns among the sub-genres, so let’s split the ‘genres’ variable so that each sub-genre has its own row.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1">genre_split <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> tv_rating <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb6-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">genres =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str_split</span>(genres, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">pattern =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">","</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb6-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unnest</span>(genres)</span></code></pre></div></div>
</details>
</div>
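<p>To make the reshaping concrete, here is a minimal sketch on made-up data. The <code>toy</code> tibble, its titles, and its <code>rating</code> column are hypothetical stand-ins, not the real <code>tv_rating</code> data. It also assumes the genres are separated by a comma that may be followed by a space, so the pattern <code>",\\s*"</code> is used to keep stray whitespace out of the sub-genre names.</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical toy data; names and values are invented for illustration
toy &lt;- tibble(
  title  = c("Show A", "Show B"),
  genres = c("Crime, Drama", "Comedy"),
  rating = c(8.2, 7.9)
)

toy_split &lt;- toy %&gt;%
  # Turn each genre string into a character vector (a list-column)
  mutate(genres = str_split(genres, pattern = ",\\s*")) %&gt;%
  # Expand the list-column so each sub-genre gets its own row
  unnest(cols = genres)

toy_split</code></pre></div>
</div>
<p>Each show now contributes one row per sub-genre (three rows in this toy case), which also means a show with several sub-genres is counted once in each of them.</p>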
</section>
<section id="after-splitting-up-genres-which-sub-genres-are-most-popular" class="level3">
<h3 class="anchored" data-anchor-id="after-splitting-up-genres-which-sub-genres-are-most-popular">After splitting up ‘genres’, which sub-genres are most popular?</h3>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb7-1">genre_split <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(genres) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">top_ratings</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gt</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tab_header</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Which sub-genres are most popular?"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fmt_number</span>(</span>
<span id="cb7-7">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">columns =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(avg_rating, med_rating),</span>
<span id="cb7-8">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">decimals =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span></span>
<span id="cb7-9">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cols_label</span>(</span>
<span id="cb7-11">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">genres =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Sub-genres"</span>,</span>
<span id="cb7-12">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Number of responses"</span>,</span>
<span id="cb7-13">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">avg_rating =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Average rating"</span>,</span>
<span id="cb7-14">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">med_rating =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Median rating"</span></span>
<span id="cb7-15">  )</span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div id="xzybrspoct" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
<style>#xzybrspoct table {
  font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
  -webkit-font-smoothing: antialiased;
  -moz-osx-font-smoothing: grayscale;
}

#xzybrspoct thead, #xzybrspoct tbody, #xzybrspoct tfoot, #xzybrspoct tr, #xzybrspoct td, #xzybrspoct th {
  border-style: none;
}

#xzybrspoct p {
  margin: 0;
  padding: 0;
}

#xzybrspoct .gt_table {
  display: table;
  border-collapse: collapse;
  line-height: normal;
  margin-left: auto;
  margin-right: auto;
  color: #333333;
  font-size: 16px;
  font-weight: normal;
  font-style: normal;
  background-color: #FFFFFF;
  width: auto;
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #A8A8A8;
  border-right-style: none;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #A8A8A8;
  border-left-style: none;
  border-left-width: 2px;
  border-left-color: #D3D3D3;
}

#xzybrspoct .gt_caption {
  padding-top: 4px;
  padding-bottom: 4px;
}

#xzybrspoct .gt_title {
  color: #333333;
  font-size: 125%;
  font-weight: initial;
  padding-top: 4px;
  padding-bottom: 4px;
  padding-left: 5px;
  padding-right: 5px;
  border-bottom-color: #FFFFFF;
  border-bottom-width: 0;
}

#xzybrspoct .gt_subtitle {
  color: #333333;
  font-size: 85%;
  font-weight: initial;
  padding-top: 3px;
  padding-bottom: 5px;
  padding-left: 5px;
  padding-right: 5px;
  border-top-color: #FFFFFF;
  border-top-width: 0;
}

#xzybrspoct .gt_heading {
  background-color: #FFFFFF;
  text-align: center;
  border-bottom-color: #FFFFFF;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
}

#xzybrspoct .gt_bottom_border {
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
}

#xzybrspoct .gt_col_headings {
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
}

#xzybrspoct .gt_col_heading {
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: normal;
  text-transform: inherit;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
  vertical-align: bottom;
  padding-top: 5px;
  padding-bottom: 6px;
  padding-left: 5px;
  padding-right: 5px;
  overflow-x: hidden;
}

#xzybrspoct .gt_column_spanner_outer {
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: normal;
  text-transform: inherit;
  padding-top: 0;
  padding-bottom: 0;
  padding-left: 4px;
  padding-right: 4px;
}

#xzybrspoct .gt_column_spanner_outer:first-child {
  padding-left: 0;
}

#xzybrspoct .gt_column_spanner_outer:last-child {
  padding-right: 0;
}

#xzybrspoct .gt_column_spanner {
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  vertical-align: bottom;
  padding-top: 5px;
  padding-bottom: 5px;
  overflow-x: hidden;
  display: inline-block;
  width: 100%;
}

#xzybrspoct .gt_spanner_row {
  border-bottom-style: hidden;
}

#xzybrspoct .gt_group_heading {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: initial;
  text-transform: inherit;
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
  vertical-align: middle;
  text-align: left;
}

#xzybrspoct .gt_empty_group_heading {
  padding: 0.5px;
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: initial;
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  vertical-align: middle;
}

#xzybrspoct .gt_from_md > :first-child {
  margin-top: 0;
}

#xzybrspoct .gt_from_md > :last-child {
  margin-bottom: 0;
}

#xzybrspoct .gt_row {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  margin: 10px;
  border-top-style: solid;
  border-top-width: 1px;
  border-top-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
  vertical-align: middle;
  overflow-x: hidden;
}

#xzybrspoct .gt_stub {
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: initial;
  text-transform: inherit;
  border-right-style: solid;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
  padding-left: 5px;
  padding-right: 5px;
}

#xzybrspoct .gt_stub_row_group {
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: initial;
  text-transform: inherit;
  border-right-style: solid;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
  padding-left: 5px;
  padding-right: 5px;
  vertical-align: top;
}

#xzybrspoct .gt_row_group_first td {
  border-top-width: 2px;
}

#xzybrspoct .gt_row_group_first th {
  border-top-width: 2px;
}

#xzybrspoct .gt_summary_row {
  color: #333333;
  background-color: #FFFFFF;
  text-transform: inherit;
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
}

#xzybrspoct .gt_first_summary_row {
  border-top-style: solid;
  border-top-color: #D3D3D3;
}

#xzybrspoct .gt_first_summary_row.thick {
  border-top-width: 2px;
}

#xzybrspoct .gt_last_summary_row {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
}

#xzybrspoct .gt_grand_summary_row {
  color: #333333;
  background-color: #FFFFFF;
  text-transform: inherit;
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
}

#xzybrspoct .gt_first_grand_summary_row {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  border-top-style: double;
  border-top-width: 6px;
  border-top-color: #D3D3D3;
}

#xzybrspoct .gt_last_grand_summary_row_top {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  border-bottom-style: double;
  border-bottom-width: 6px;
  border-bottom-color: #D3D3D3;
}

#xzybrspoct .gt_striped {
  background-color: rgba(128, 128, 128, 0.05);
}

#xzybrspoct .gt_table_body {
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
}

#xzybrspoct .gt_footnotes {
  color: #333333;
  background-color: #FFFFFF;
  border-bottom-style: none;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 2px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
}

#xzybrspoct .gt_footnote {
  margin: 0px;
  font-size: 90%;
  padding-top: 4px;
  padding-bottom: 4px;
  padding-left: 5px;
  padding-right: 5px;
}

#xzybrspoct .gt_sourcenotes {
  color: #333333;
  background-color: #FFFFFF;
  border-bottom-style: none;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 2px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
}

#xzybrspoct .gt_sourcenote {
  font-size: 90%;
  padding-top: 4px;
  padding-bottom: 4px;
  padding-left: 5px;
  padding-right: 5px;
}

#xzybrspoct .gt_left {
  text-align: left;
}

#xzybrspoct .gt_center {
  text-align: center;
}

#xzybrspoct .gt_right {
  text-align: right;
  font-variant-numeric: tabular-nums;
}

#xzybrspoct .gt_font_normal {
  font-weight: normal;
}

#xzybrspoct .gt_font_bold {
  font-weight: bold;
}

#xzybrspoct .gt_font_italic {
  font-style: italic;
}

#xzybrspoct .gt_super {
  font-size: 65%;
}

#xzybrspoct .gt_footnote_marks {
  font-size: 75%;
  vertical-align: 0.4em;
  position: initial;
}

#xzybrspoct .gt_asterisk {
  font-size: 100%;
  vertical-align: 0;
}

#xzybrspoct .gt_indent_1 {
  text-indent: 5px;
}

#xzybrspoct .gt_indent_2 {
  text-indent: 10px;
}

#xzybrspoct .gt_indent_3 {
  text-indent: 15px;
}

#xzybrspoct .gt_indent_4 {
  text-indent: 20px;
}

#xzybrspoct .gt_indent_5 {
  text-indent: 25px;
}

#xzybrspoct .katex-display {
  display: inline-flex !important;
  margin-bottom: 0.75em !important;
}

#xzybrspoct div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
  height: 0px !important;
}
</style>

<table class="gt_table caption-top table table-sm table-striped small" data-quarto-bootstrap="false">
<thead>
<tr class="gt_heading header">
<td colspan="4" class="gt_heading gt_title gt_font_normal gt_bottom_border">Which sub-genres are most popular?</td>
</tr>
<tr class="gt_col_headings even">
<th id="genres" class="gt_col_heading gt_columns_bottom_border gt_left" data-quarto-table-cell-role="th" scope="col">Sub-genres</th>
<th id="n" class="gt_col_heading gt_columns_bottom_border gt_right" data-quarto-table-cell-role="th" scope="col">Number of responses</th>
<th id="avg_rating" class="gt_col_heading gt_columns_bottom_border gt_right" data-quarto-table-cell-role="th" scope="col">Average rating</th>
<th id="med_rating" class="gt_col_heading gt_columns_bottom_border gt_right" data-quarto-table-cell-role="th" scope="col">Median rating</th>
</tr>
</thead>
<tbody class="gt_table_body">
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Sport</td>
<td class="gt_row gt_right" headers="n">29</td>
<td class="gt_row gt_right" headers="avg_rating">8.339</td>
<td class="gt_row gt_right" headers="med_rating">8.381</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">History</td>
<td class="gt_row gt_right" headers="n">62</td>
<td class="gt_row gt_right" headers="avg_rating">8.274</td>
<td class="gt_row gt_right" headers="med_rating">8.343</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Music</td>
<td class="gt_row gt_right" headers="n">32</td>
<td class="gt_row gt_right" headers="avg_rating">8.186</td>
<td class="gt_row gt_right" headers="med_rating">8.291</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Thriller</td>
<td class="gt_row gt_right" headers="n">160</td>
<td class="gt_row gt_right" headers="avg_rating">8.169</td>
<td class="gt_row gt_right" headers="med_rating">8.256</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Fantasy</td>
<td class="gt_row gt_right" headers="n">223</td>
<td class="gt_row gt_right" headers="avg_rating">8.197</td>
<td class="gt_row gt_right" headers="med_rating">8.212</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Horror</td>
<td class="gt_row gt_right" headers="n">124</td>
<td class="gt_row gt_right" headers="avg_rating">8.093</td>
<td class="gt_row gt_right" headers="med_rating">8.211</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Family</td>
<td class="gt_row gt_right" headers="n">76</td>
<td class="gt_row gt_right" headers="avg_rating">8.063</td>
<td class="gt_row gt_right" headers="med_rating">8.179</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Crime</td>
<td class="gt_row gt_right" headers="n">822</td>
<td class="gt_row gt_right" headers="avg_rating">8.101</td>
<td class="gt_row gt_right" headers="med_rating">8.144</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Drama</td>
<td class="gt_row gt_right" headers="n">2266</td>
<td class="gt_row gt_right" headers="avg_rating">8.061</td>
<td class="gt_row gt_right" headers="med_rating">8.115</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Mystery</td>
<td class="gt_row gt_right" headers="n">558</td>
<td class="gt_row gt_right" headers="avg_rating">8.020</td>
<td class="gt_row gt_right" headers="med_rating">8.099</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Action</td>
<td class="gt_row gt_right" headers="n">387</td>
<td class="gt_row gt_right" headers="avg_rating">8.085</td>
<td class="gt_row gt_right" headers="med_rating">8.099</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Comedy</td>
<td class="gt_row gt_right" headers="n">516</td>
<td class="gt_row gt_right" headers="avg_rating">8.040</td>
<td class="gt_row gt_right" headers="med_rating">8.074</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Biography</td>
<td class="gt_row gt_right" headers="n">29</td>
<td class="gt_row gt_right" headers="avg_rating">8.111</td>
<td class="gt_row gt_right" headers="med_rating">8.072</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Adventure</td>
<td class="gt_row gt_right" headers="n">204</td>
<td class="gt_row gt_right" headers="avg_rating">8.024</td>
<td class="gt_row gt_right" headers="med_rating">8.033</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Romance</td>
<td class="gt_row gt_right" headers="n">235</td>
<td class="gt_row gt_right" headers="avg_rating">7.976</td>
<td class="gt_row gt_right" headers="med_rating">7.997</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="genres">Sci-Fi</td>
<td class="gt_row gt_right" headers="n">154</td>
<td class="gt_row gt_right" headers="avg_rating">7.925</td>
<td class="gt_row gt_right" headers="med_rating">7.927</td>
</tr>
<tr class="odd">
<td class="gt_row gt_left" headers="genres">Animation</td>
<td class="gt_row gt_right" headers="n">36</td>
<td class="gt_row gt_right" headers="avg_rating">8.002</td>
<td class="gt_row gt_right" headers="med_rating">7.891</td>
</tr>
</tbody>
</table>

</div>
</div>
</div>
<p>Still, there is very little variability between sub-genres. The difference between the highest (‘Sport’) and lowest (‘Animation’) median rating is around 0.49. Nevertheless, these results paint a different picture than when the sub-genres were grouped together.</p>
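<p>The 0.49 figure can be checked directly against the table: it is the gap between the largest and smallest median ratings. The sketch below uses values copied by hand from the table above (the <code>sub_genres</code> and <code>spread</code> names are mine, chosen for illustration) and computes the same gap for the averages.</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)

# Average and median ratings copied from the sub-genre table above
sub_genres &lt;- tibble(
  genres = c("Sport", "History", "Music", "Thriller", "Fantasy", "Horror",
             "Family", "Crime", "Drama", "Mystery", "Action", "Comedy",
             "Biography", "Adventure", "Romance", "Sci-Fi", "Animation"),
  avg_rating = c(8.339, 8.274, 8.186, 8.169, 8.197, 8.093,
                 8.063, 8.101, 8.061, 8.020, 8.085, 8.040,
                 8.111, 8.024, 7.976, 7.925, 8.002),
  med_rating = c(8.381, 8.343, 8.291, 8.256, 8.212, 8.211,
                 8.179, 8.144, 8.115, 8.099, 8.099, 8.074,
                 8.072, 8.033, 7.997, 7.927, 7.891)
)

# Gap between the highest and lowest value in each rating column
spread &lt;- sub_genres %&gt;%
  summarise(across(c(avg_rating, med_rating), ~ max(.x) - min(.x)))

round(spread, 3)</code></pre></div>
</div>
<p>The gap is about 0.49 for the medians and about 0.41 for the averages. On the averages, the lowest value belongs to ‘Sci-Fi’ rather than ‘Animation’.</p>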
<p>Now, it looks like sport, history, and music are the highest-rated sub-genres, not the suspenseful ones (i.e., crime, thriller, and horror) we saw earlier, although ‘Sport’ and ‘Music’ are each based on only about 30 responses. To be fair, these new sub-genres could very well be suspenseful, but they seem to have a slightly different “theme” than the former ones.</p>
<p>This raises a good point: Within the dramas, there are different types, and these types could be valued for different reasons. For example, a comedy drama might be valued for certain positive associations (e.g., laughter), whereas a crime drama might be valued for certain negative associations (e.g., fear). So, perhaps there is a difference between comedies (defined here as comedy and animation) and tragedies (defined here as crime, horror, and thriller). Granted, these definitions could be debated or refined, but they should provide a rough snapshot of the idea.</p>
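<p>As a rough sketch of that grouping, the mapping from sub-genre to drama type can be written with <code>case_when()</code> and <code>%in%</code>. The <code>toy</code> tibble below is hypothetical and only lists a handful of sub-genres; it is not the real data.</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)

# Hypothetical toy data: one row per sub-genre label
toy &lt;- tibble(
  genres = c("Comedy", "Animation", "Crime", "Horror", "Thriller", "Romance")
)

toy_labelled &lt;- toy %&gt;%
  mutate(
    com_trag = case_when(
      genres %in% c("Comedy", "Animation")         ~ "comedy",
      genres %in% c("Crime", "Horror", "Thriller") ~ "tragedy"
      # any other sub-genre matches no condition and becomes NA
    )
  ) %&gt;%
  # Keep only the rows that fall into one of the two drama types
  filter(!is.na(com_trag))

toy_labelled</code></pre></div>
</div>
<p>One caveat with this kind of labelling: because the split data has one row per sub-genre, a show tagged with both a comedy and a tragedy sub-genre (e.g., ‘Comedy, Crime, Drama’) would contribute a row to each group.</p>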
</section>
<section id="is-there-a-difference-in-ratings-between-comedies-and-tragedies" class="level3">
<h3 class="anchored" data-anchor-id="is-there-a-difference-in-ratings-between-comedies-and-tragedies">Is there a difference in ratings between comedies and tragedies?</h3>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb8-1">genre_split <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(</span>
<span id="cb8-3">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">com_trag =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">case_when</span>(</span>
<span id="cb8-4">      genres <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Comedy"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb8-5">        genres <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Animation"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"comedy"</span>,</span>
<span id="cb8-6">      genres <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Crime"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb8-7">        genres <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Horror"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span></span>
<span id="cb8-8">        genres <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Thriller"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tragedy"</span></span>
<span id="cb8-9">    )</span>
<span id="cb8-10">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-11">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">is.na</span>(com_trag)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-12">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(com_trag) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-13">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">top_ratings</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-14">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">com_trag =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str_to_title</span>(com_trag)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-15">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gt</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-16">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tab_header</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Mean/median differences between comedies and tragedies"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-17">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fmt_number</span>(</span>
<span id="cb8-18">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">columns =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(avg_rating, med_rating),</span>
<span id="cb8-19">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">decimals =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span></span>
<span id="cb8-20">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-21">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cols_label</span>(</span>
<span id="cb8-22">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">com_trag =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Drama type"</span>,</span>
<span id="cb8-23">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Number of responses"</span>,</span>
<span id="cb8-24">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">avg_rating =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Average rating"</span>,</span>
<span id="cb8-25">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">med_rating =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Median rating"</span></span>
<span id="cb8-26">  )</span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div id="ygpcwrvjtw" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
<style>#ygpcwrvjtw table {
  font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
  -webkit-font-smoothing: antialiased;
  -moz-osx-font-smoothing: grayscale;
}

#ygpcwrvjtw thead, #ygpcwrvjtw tbody, #ygpcwrvjtw tfoot, #ygpcwrvjtw tr, #ygpcwrvjtw td, #ygpcwrvjtw th {
  border-style: none;
}

#ygpcwrvjtw p {
  margin: 0;
  padding: 0;
}

#ygpcwrvjtw .gt_table {
  display: table;
  border-collapse: collapse;
  line-height: normal;
  margin-left: auto;
  margin-right: auto;
  color: #333333;
  font-size: 16px;
  font-weight: normal;
  font-style: normal;
  background-color: #FFFFFF;
  width: auto;
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #A8A8A8;
  border-right-style: none;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #A8A8A8;
  border-left-style: none;
  border-left-width: 2px;
  border-left-color: #D3D3D3;
}

#ygpcwrvjtw .gt_caption {
  padding-top: 4px;
  padding-bottom: 4px;
}

#ygpcwrvjtw .gt_title {
  color: #333333;
  font-size: 125%;
  font-weight: initial;
  padding-top: 4px;
  padding-bottom: 4px;
  padding-left: 5px;
  padding-right: 5px;
  border-bottom-color: #FFFFFF;
  border-bottom-width: 0;
}

#ygpcwrvjtw .gt_subtitle {
  color: #333333;
  font-size: 85%;
  font-weight: initial;
  padding-top: 3px;
  padding-bottom: 5px;
  padding-left: 5px;
  padding-right: 5px;
  border-top-color: #FFFFFF;
  border-top-width: 0;
}

#ygpcwrvjtw .gt_heading {
  background-color: #FFFFFF;
  text-align: center;
  border-bottom-color: #FFFFFF;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
}

#ygpcwrvjtw .gt_bottom_border {
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
}

#ygpcwrvjtw .gt_col_headings {
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
}

#ygpcwrvjtw .gt_col_heading {
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: normal;
  text-transform: inherit;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
  vertical-align: bottom;
  padding-top: 5px;
  padding-bottom: 6px;
  padding-left: 5px;
  padding-right: 5px;
  overflow-x: hidden;
}

#ygpcwrvjtw .gt_column_spanner_outer {
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: normal;
  text-transform: inherit;
  padding-top: 0;
  padding-bottom: 0;
  padding-left: 4px;
  padding-right: 4px;
}

#ygpcwrvjtw .gt_column_spanner_outer:first-child {
  padding-left: 0;
}

#ygpcwrvjtw .gt_column_spanner_outer:last-child {
  padding-right: 0;
}

#ygpcwrvjtw .gt_column_spanner {
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  vertical-align: bottom;
  padding-top: 5px;
  padding-bottom: 5px;
  overflow-x: hidden;
  display: inline-block;
  width: 100%;
}

#ygpcwrvjtw .gt_spanner_row {
  border-bottom-style: hidden;
}

#ygpcwrvjtw .gt_group_heading {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: initial;
  text-transform: inherit;
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
  vertical-align: middle;
  text-align: left;
}

#ygpcwrvjtw .gt_empty_group_heading {
  padding: 0.5px;
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: initial;
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  vertical-align: middle;
}

#ygpcwrvjtw .gt_from_md > :first-child {
  margin-top: 0;
}

#ygpcwrvjtw .gt_from_md > :last-child {
  margin-bottom: 0;
}

#ygpcwrvjtw .gt_row {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  margin: 10px;
  border-top-style: solid;
  border-top-width: 1px;
  border-top-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 1px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 1px;
  border-right-color: #D3D3D3;
  vertical-align: middle;
  overflow-x: hidden;
}

#ygpcwrvjtw .gt_stub {
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: initial;
  text-transform: inherit;
  border-right-style: solid;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
  padding-left: 5px;
  padding-right: 5px;
}

#ygpcwrvjtw .gt_stub_row_group {
  color: #333333;
  background-color: #FFFFFF;
  font-size: 100%;
  font-weight: initial;
  text-transform: inherit;
  border-right-style: solid;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
  padding-left: 5px;
  padding-right: 5px;
  vertical-align: top;
}

#ygpcwrvjtw .gt_row_group_first td {
  border-top-width: 2px;
}

#ygpcwrvjtw .gt_row_group_first th {
  border-top-width: 2px;
}

#ygpcwrvjtw .gt_summary_row {
  color: #333333;
  background-color: #FFFFFF;
  text-transform: inherit;
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
}

#ygpcwrvjtw .gt_first_summary_row {
  border-top-style: solid;
  border-top-color: #D3D3D3;
}

#ygpcwrvjtw .gt_first_summary_row.thick {
  border-top-width: 2px;
}

#ygpcwrvjtw .gt_last_summary_row {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
}

#ygpcwrvjtw .gt_grand_summary_row {
  color: #333333;
  background-color: #FFFFFF;
  text-transform: inherit;
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
}

#ygpcwrvjtw .gt_first_grand_summary_row {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  border-top-style: double;
  border-top-width: 6px;
  border-top-color: #D3D3D3;
}

#ygpcwrvjtw .gt_last_grand_summary_row_top {
  padding-top: 8px;
  padding-bottom: 8px;
  padding-left: 5px;
  padding-right: 5px;
  border-bottom-style: double;
  border-bottom-width: 6px;
  border-bottom-color: #D3D3D3;
}

#ygpcwrvjtw .gt_striped {
  background-color: rgba(128, 128, 128, 0.05);
}

#ygpcwrvjtw .gt_table_body {
  border-top-style: solid;
  border-top-width: 2px;
  border-top-color: #D3D3D3;
  border-bottom-style: solid;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
}

#ygpcwrvjtw .gt_footnotes {
  color: #333333;
  background-color: #FFFFFF;
  border-bottom-style: none;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 2px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
}

#ygpcwrvjtw .gt_footnote {
  margin: 0px;
  font-size: 90%;
  padding-top: 4px;
  padding-bottom: 4px;
  padding-left: 5px;
  padding-right: 5px;
}

#ygpcwrvjtw .gt_sourcenotes {
  color: #333333;
  background-color: #FFFFFF;
  border-bottom-style: none;
  border-bottom-width: 2px;
  border-bottom-color: #D3D3D3;
  border-left-style: none;
  border-left-width: 2px;
  border-left-color: #D3D3D3;
  border-right-style: none;
  border-right-width: 2px;
  border-right-color: #D3D3D3;
}

#ygpcwrvjtw .gt_sourcenote {
  font-size: 90%;
  padding-top: 4px;
  padding-bottom: 4px;
  padding-left: 5px;
  padding-right: 5px;
}

#ygpcwrvjtw .gt_left {
  text-align: left;
}

#ygpcwrvjtw .gt_center {
  text-align: center;
}

#ygpcwrvjtw .gt_right {
  text-align: right;
  font-variant-numeric: tabular-nums;
}

#ygpcwrvjtw .gt_font_normal {
  font-weight: normal;
}

#ygpcwrvjtw .gt_font_bold {
  font-weight: bold;
}

#ygpcwrvjtw .gt_font_italic {
  font-style: italic;
}

#ygpcwrvjtw .gt_super {
  font-size: 65%;
}

#ygpcwrvjtw .gt_footnote_marks {
  font-size: 75%;
  vertical-align: 0.4em;
  position: initial;
}

#ygpcwrvjtw .gt_asterisk {
  font-size: 100%;
  vertical-align: 0;
}

#ygpcwrvjtw .gt_indent_1 {
  text-indent: 5px;
}

#ygpcwrvjtw .gt_indent_2 {
  text-indent: 10px;
}

#ygpcwrvjtw .gt_indent_3 {
  text-indent: 15px;
}

#ygpcwrvjtw .gt_indent_4 {
  text-indent: 20px;
}

#ygpcwrvjtw .gt_indent_5 {
  text-indent: 25px;
}

#ygpcwrvjtw .katex-display {
  display: inline-flex !important;
  margin-bottom: 0.75em !important;
}

#ygpcwrvjtw div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
  height: 0px !important;
}
</style>

<table class="gt_table caption-top table table-sm table-striped small" data-quarto-bootstrap="false">
<thead>
<tr class="gt_heading header">
<td colspan="4" class="gt_heading gt_title gt_font_normal gt_bottom_border">Mean/median differences between comedies and tragedies</td>
</tr>
<tr class="gt_col_headings even">
<th id="com_trag" class="gt_col_heading gt_columns_bottom_border gt_left" data-quarto-table-cell-role="th" scope="col">Drama type</th>
<th id="n" class="gt_col_heading gt_columns_bottom_border gt_right" data-quarto-table-cell-role="th" scope="col">Number of responses</th>
<th id="avg_rating" class="gt_col_heading gt_columns_bottom_border gt_right" data-quarto-table-cell-role="th" scope="col">Average rating</th>
<th id="med_rating" class="gt_col_heading gt_columns_bottom_border gt_right" data-quarto-table-cell-role="th" scope="col">Median rating</th>
</tr>
</thead>
<tbody class="gt_table_body">
<tr class="odd">
<td class="gt_row gt_left" headers="com_trag">Tragedy</td>
<td class="gt_row gt_right" headers="n">1106</td>
<td class="gt_row gt_right" headers="avg_rating">8.110</td>
<td class="gt_row gt_right" headers="med_rating">8.168</td>
</tr>
<tr class="even">
<td class="gt_row gt_left" headers="com_trag">Comedy</td>
<td class="gt_row gt_right" headers="n">552</td>
<td class="gt_row gt_right" headers="avg_rating">8.038</td>
<td class="gt_row gt_right" headers="med_rating">8.071</td>
</tr>
</tbody>
</table>

</div>
</div>
</div>
<p>So there is a difference: people tend to rate tragedies higher than comedies. In the grand scheme of things, though, this difference is quite small. The average ratings of tragedies and comedies differ by only 0.07, and the medians differ by 0.1. Thus, it seems there’s not much difference in viewer ratings among sub-genres, at least not in our sample. But this isn’t actually <em>that</em> surprising: our sample was already narrowed down to TV <em>drama</em> titles. Since all of the titles share this common characteristic, what we’re probably seeing is viewers’ tendency to rate TV dramas consistently. In other words, people tend to rate all TV dramas similarly, regardless of the story line/sub-genre.</p>
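<p>As a quick sanity check, here is a minimal sketch that recomputes those two gaps directly from the table above. The <code>ratings</code> data frame is a hypothetical stand-in, typed in by hand from the table (it is not an object from the original analysis), and its column names simply reuse the table’s column ids.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># hypothetical stand-in: values copied by hand from the summary table above
ratings &lt;- data.frame(
  com_trag   = c("Tragedy", "Comedy"),
  avg_rating = c(8.110, 8.038),
  med_rating = c(8.168, 8.071)
)

# tragedy minus comedy, for the mean and the median
avg_diff &lt;- ratings$avg_rating[1] - ratings$avg_rating[2]
med_diff &lt;- ratings$med_rating[1] - ratings$med_rating[2]

# round to two decimals: roughly 0.07 (mean) and 0.10 (median)
round(c(avg = avg_diff, med = med_diff), 2)</code></pre></div>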
<p>Since sub-genres didn’t yield much useful information, let’s take a look at the actual titles within the dataset. All of the top-rated shows aired for multiple seasons, but I doubt <em>every</em> show that aired multiple seasons was popular. In fact, some earlier analyses showed a decline in ratings over time. So, let’s see what the data say.</p>
</section>
<section id="how-do-viewer-ratings-changes-over-time-by-tv-show-title" class="level3">
<h3 class="anchored" data-anchor-id="how-do-viewer-ratings-changes-over-time-by-tv-show-title">How do viewer ratings change over time by TV show title?</h3>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb9-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># list most popular shows from earlier analysis (with extra picks of my own)</span></span>
<span id="cb9-2">shows <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"The X-Files"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Law &amp; Order"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Midsomer Murders"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Law &amp; Order: Special Victims Unit"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ER"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Grey's Anatomy"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"CSI: Crime Scene Investigation"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Supernatural"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"King of the Hill"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Doctor Who"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Criminal Minds"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Bones"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Murdoch Mysteries"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"American Horror Story"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Are you Afraid of the Dark?"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Californication"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Elementary"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Lost"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Numb3rs"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Shameless"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"The Walking Dead"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"The Sopranos"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Scrubs"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Oz"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"House"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dexter"</span>)</span>
<span id="cb9-3"></span>
<span id="cb9-4">tv_rating <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(title <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%in%</span> shows) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str_replace</span>(title, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Special Victims Unit"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"SVU"</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(title) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(date, av_rating) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">facet_wrap</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span>title) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-11">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-12">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(</span>
<span id="cb9-13">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Viewer ratings over time by TV show title"</span>,</span>
<span id="cb9-14">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Years aired"</span>,</span>
<span id="cb9-15">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Average rating"</span></span>
<span id="cb9-16">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-17">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_light</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-18">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">axis.text.x =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">element_text</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">angle =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">90</span>))</span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.jrwinget.com/blog/2019-01-08_first-tidytuesday/index_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Okay, now we have some interesting patterns to interpret. Overall, it looks like even the most popular shows experienced a decline in enthusiasm the longer they aired. However, there are notable exceptions to this trend: Criminal Minds and Murdoch Mysteries have really taken off in the last few years, both receiving the highest ratings of any of the titles in this sample. American Horror Story also appears to be making a comeback recently. Notably, these exceptions all fit the suspenseful dramas noted earlier. It seems, then, that the most successful TV dramas are ones with intense or striking elements (e.g., crime, murder, horror). I wonder if this says anything about the culture of the viewers…lookin’ at you America 🤔</p>


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{2019,
  author = {},
  title = {First {TidyTuesday} Submission},
  date = {2019-01-08},
  url = {https://www.jrwinget.com/blog/2019-01-08_first-tidytuesday/},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-2019" class="csl-entry quarto-appendix-citeas">
<span>“First TidyTuesday Submission.”</span> 2019. January 8, 2019. <a href="https://www.jrwinget.com/blog/2019-01-08_first-tidytuesday/">https://www.jrwinget.com/blog/2019-01-08_first-tidytuesday/</a>.
</div></div></section></div> ]]></description>
  <category>Data Science</category>
  <category>Education &amp; Community</category>
  <guid>https://www.jrwinget.com/blog/2019-01-08_first-tidytuesday/</guid>
  <pubDate>Tue, 08 Jan 2019 00:00:00 GMT</pubDate>
  <media:content url="https://www.jrwinget.com/blog/2019-01-08_first-tidytuesday/featured.png" medium="image" type="image/png" height="144" width="144"/>
</item>
<item>
  <title>First post!</title>
  <link>https://www.jrwinget.com/blog/2018-02-07_first-post/</link>
  <description><![CDATA[ 




<section id="hello-world" class="level1">
<h1>Hello, world!</h1>
<p>Thanks for taking the time to read my first blog post! While I am trained as a social psychologist, I’m a big proponent of leveraging data science tools to understand the world around us. Thus, I plan to discuss a mixture of data-driven topics and insights here. I hope to use this blog as a place to write about things that interest me, but the focus will likely be on statistics, research methods, open science, social psychology, and data science. This will be a place for me to share my thoughts on a variety of topics, offer guidance on statistical/methodological topics for the social sciences, and hopefully generate a bit of discussion.</p>
<p>To tell you a little more about myself, I am currently a fourth-year PhD student in applied social psychology and received my MA in applied social psychology from Loyola University Chicago in 2016. My research interests are related to group dynamics, cooperation/conflict, social influence, and quantitative methods. I started conducting my data analysis in R about a year ago, and have been interested in the power of open source data science tools ever since. I believe one way to improve our current scientific system is to create and use good open source tools, and we also need to make these tools easy enough for all scientists to use. In other words, we need to make science more accessible.</p>
<p>I’ll stop myself before I stand too tall on my open-science soapbox (I’ll save that for another post). For the remainder of this post, I want to take some time to explain where the header image on my homepage comes from. Since I’m also a former automotive mechanic, I thought I would pay a bit of homage to my former trade. If you are at all familiar with R (or RStudio), there’s a good chance you’re familiar with the <code>mpg</code> dataset. If you’re not familiar with this dataset, the data are a subset of the fuel economy data the EPA makes available <a href="http://fueleconomy.gov">here</a>. It contains only models which had a new release every year between 1999 and 2008, and this was used as a proxy for the popularity of the car. The <code>mpg</code> dataset is part of the <code>ggplot2</code> package in R, and I used it to create the image of the graph on my homepage.</p>
<section id="the-data" class="level2">
<h2 class="anchored" data-anchor-id="the-data">The data</h2>
<p>First, let’s take a look at the structure of the data:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggplot2)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str</span>(mpg)</span>
<span id="cb1-3"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## tibble [234 × 11] (S3: tbl_df/tbl/data.frame)</span></span>
<span id="cb1-4"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  $ manufacturer: chr [1:234] "audi" "audi" "audi" "audi" ...</span></span>
<span id="cb1-5"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  $ model       : chr [1:234] "a4" "a4" "a4" "a4" ...</span></span>
<span id="cb1-6"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  $ displ       : num [1:234] 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...</span></span>
<span id="cb1-7"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  $ year        : int [1:234] 1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...</span></span>
<span id="cb1-8"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  $ cyl         : int [1:234] 4 4 4 4 6 6 6 4 4 4 ...</span></span>
<span id="cb1-9"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  $ trans       : chr [1:234] "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...</span></span>
<span id="cb1-10"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  $ drv         : chr [1:234] "f" "f" "f" "f" ...</span></span>
<span id="cb1-11"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  $ cty         : int [1:234] 18 21 20 21 16 18 18 18 16 20 ...</span></span>
<span id="cb1-12"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  $ hwy         : int [1:234] 29 29 31 30 26 26 27 26 25 28 ...</span></span>
<span id="cb1-13"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  $ fl          : chr [1:234] "p" "p" "p" "p" ...</span></span>
<span id="cb1-14"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  $ class       : chr [1:234] "compact" "compact" "compact" "compact" ...</span></span></code></pre></div></div>
</details>
</div>
<p>Among other things, this tells us there are 234 observations of 11 different variables. Most of these variables are fairly intuitive by their names, but for our purposes, I will only focus on three of them: <code>displ</code>, <code>hwy</code>, and <code>cyl</code>. These are the cars’ engine displacement (in litres), highway miles per gallon, and number of cylinders, respectively.</p>
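<p>For a quick look at just those three columns, something like the following sketch should work. It assumes <code>ggplot2</code> is installed so that <code>mpg</code> is available; the object name <code>mpg_small</code> is only an illustrative choice and does not appear elsewhere in this post.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# keep only the three variables discussed above
# (mpg_small is a hypothetical name used for illustration)
mpg_small &lt;- mpg[, c("displ", "hwy", "cyl")]

# first few rows, then basic summaries of each column
head(mpg_small)
summary(mpg_small)</code></pre></div>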
</section>
<section id="creating-the-graph" class="level2">
<h2 class="anchored" data-anchor-id="creating-the-graph">Creating the graph</h2>
<p>Here is the code to create the graph using the <code>ggplot2</code> package:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggplot2)</span>
<span id="cb2-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(mpg, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(displ, hwy)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb2-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">col =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">factor</span>(cyl))) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb2-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_smooth</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">method =</span> lm, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">col =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">factor</span>(cyl)))</span>
<span id="cb2-5"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## `geom_smooth()` using formula = 'y ~ x'</span></span></code></pre></div></div>
</details>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.jrwinget.com/blog/2018-02-07_first-post/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>And that’s all there is to it! While it may seem pretty straightforward, this actually took me a few days to create when I first started coding in R. Indeed, one’s first <code>ggplot</code> seems like quite the achievement in the moment!</p>
<p>If you’re scratching your head after reading this post, don’t worry! I plan to blog about plenty of neat things one can do in R (and in much more detail), so stay tuned!</p>


</section>
</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{2018,
  author = {},
  title = {First Post!},
  date = {2018-02-07},
  url = {https://www.jrwinget.com/blog/2018-02-07_first-post/},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-2018" class="csl-entry quarto-appendix-citeas">
<span>“First Post!”</span> 2018. February 7, 2018. <a href="https://www.jrwinget.com/blog/2018-02-07_first-post/">https://www.jrwinget.com/blog/2018-02-07_first-post/</a>.
</div></div></section></div> ]]></description>
  <category>Data Science</category>
  <guid>https://www.jrwinget.com/blog/2018-02-07_first-post/</guid>
  <pubDate>Wed, 07 Feb 2018 00:00:00 GMT</pubDate>
  <media:content url="https://www.jrwinget.com/blog/2018-02-07_first-post/featured.png" medium="image" type="image/png" height="81" width="144"/>
</item>
</channel>
</rss>
