Adam C. Conrad
https://www.adamconrad.dev
The website and home of engineering leader Adam C. Conrad.Thu, 27 Feb 2020 23:00:19 +0000Working Remotely When You Didn't Plan To
https://www.adamconrad.dev/blog/working-remotely-when-you-didnt-plan-to/
Thu, 27 Feb 2020 23:00:19 +0000https://www.adamconrad.dev/blog/working-remotely-when-you-didnt-plan-to/2020-02-27T23:00:19+00:00Technical Lead Management 101 or How to Try Out Management
https://www.adamconrad.dev/blog/technical-lead-management/
Thu, 27 Feb 2020 23:00:19 +0000https://www.adamconrad.dev/blog/technical-lead-management/ I wrote this for my tech leads who needed a crash course in engineering management. The advice you see in this article is the advice I used to help level up my team leads.
At some point in your technical career you'll have to make a choice: stay on the individual contributor (IC) track or move into management. Many seem to believe that the latter is the only way to move up in your career.
It's not!
Plenty of people want to (and deserve) to grow their careers as software developers without having to manage others. **Regardless, at some point you'll have to test the waters to see which is right for you.** To ease this transition and minimize the risk of inundating developers with too much change, we've created a hybrid role called _Technical Lead Manager_. In this article, we're going to describe what this role is, why it is needed, and the basics for becoming a technical lead with management responsibilities.
## What is a Technical Lead Manager?
While there is no official definition of this role, in my mind **a Technical Lead Manager (TLM) is a tech lead with minimal engineering management responsibilities.** More specifically, we define "minimal" to mean **no more than 3 direct reports.** If you're brand new to engineering management, a _direct report_ means someone (e.g. a software engineer) who reports to you in your company's organizational hierarchy. In other words, you write code _and_ are the boss of up to 3 people.
### Why not call them a Junior/Associate Engineering Manager?
You could, but the distinction here is that **we expect an engineer with this role to still perform his/her technical duties (i.e. coding, debugging, code review), albeit in a more limited capacity.** Without the _Technical_ adjective, we leave the opportunity to let those technical skills atrophy in certain organizations.
### Why the need to combine both duties? Won't that overwhelm the individual?
At some point in your career, you'll be faced with the decision of considering roles in management. In many organizations, the leap to engineering manager is a large one. Make no mistake: **engineering management is a completely different skillset from software engineering and should be considered hitting the "reset button" on your skills.** With such a stark contrast in roles and responsibilities, the transition to management at the most junior levels could dissuade people from considering this career move in the first place.
Nonetheless, leaders need to groom the next generation of leaders underneath them. To ease the transition and best prepare their future leaders with the necessary skills, consider looking at this hybrid role.
## What does a Technical Lead Manager do?
At its core, a Technical Lead Manager codes _and_ manages the career of up to 3 direct reports. Since management is generally the newer of the two skills, we won't focus too much on how to develop the technical skills. For further reading on this topic, consider [this primer](https://www.getclockwise.com/blog/the-new-tech-leads-survival-guide) if you've never been a tech lead before.
Instead, let's dive into the core skills you're going to have to build from scratch.
### The Four Dials of Engineering Management
When I first became an engineering manager, my CTO gave me this analogy to think of engineering management skills _like knobs on a guitar amp_. For EMs, there are four dials: **people, process, product, and prowess**.
Each dial is turned up somewhere between 0 and 10 depending on the organization you work in. Some companies highly value technical prowess, while other companies see EMs as people managers with a previous technical background. Regardless of your company's organizational design, you're going to want to know what all four knobs are so you can dial in the exact volume of each skill.
#### People
Let me cut right to the chase: **successful engineering management is all about maximizing the effectiveness of your people.** If you take nothing else away from this guide, know that _simply caring about your people will get you 80% of the way there_. To effectively care for your people, this dial itself has 4 core aspects:
##### One-on-ones (O3s)
These are the bread and butter of effective management. **Consider them the most sacred time of your entire week.** O3s build trust, shared understanding, and rapport. From all of the materials I've read on this subject, here's a formula for the most effective O3s:
* Have them every week with all of your directs for 30 minutes.
* If you have a conflict, move your O3 but _never cancel it_.
* Start each O3 with "how are you?" or "what's on your mind?"
* At first, allot all 30 minutes for your directs to say whatever is on their agenda. They may have a lot on their mind in getting to know you so indulge them.
* At some point (around a month in), you'll want to shift it to be 15 minutes for their agenda followed by 15 minutes for your agenda.
* Finally, another month or so in, you'll want to trim a few minutes off of your agenda to bake in coaching & goal setting for your directs' career development.
I still use this formula for all of my O3s and it has served me very well. **Be an [active listener](https://www.youtube.com/watch?v=saXfavo1OQo) during their agenda.** If you're looking for what to ask during your agenda, I've referred to [this list](https://larahogan.me/blog/first-one-on-one-questions/) for seminal O3 questions to ask your direct reports. For coaching and goal setting, keep scrolling. But first, you're going to want to make feedback a central tenant of your agenda.
##### Feedback
People crave feedback. **Being an effective manager means being an astute observer.** There are lots of ways of observing your team without conjuring up an image of the nosey micromanager. You're probably doing a few of these already:
* Active listening in meetings
* Addressing blockers in stand-ups
* Reviewing code
* Pair programming
Plus countless more. The difference now is you're not just observing with the lens of a software engineer but also as a manager with a chance to provide feedback.
**Feedback is about behavior.** Let me repeat myself: feedback is _not_ about changing people but about changing behaviors. "You're bad at your job" is not feedback. It is an ad hominem attack.
Feedback is either about _encouraging a behavior (positive)_ or _stopping a behavior (constructive)_. Feedback, too, has a formula for being given effectively. The one I like to use is based on [nonviolent communication](https://en.wikipedia.org/wiki/Nonviolent_Communication). The formula, like seemingly all of my formulas, occurs in four steps:
1. **Describe the current behavior.** Example: "When you test your code..."
2. **Describe the outcome of that behavior.** Example: "...it increases the reliability of our product."
3. **Describe the effect that behavior has on you or others.** Example: "This instills trust in our customers that we build software that works."
4. **If the feedback is positive, use encouraging language to reinforce the behavior. If the feedback is constructive, describe how you would like the behavior changed in the future.** Example: "Keep it up!"
Stitching this together, our positive feedback example looks like this:
> When you test your code, it increases the reliability of our code. This instills trust in our customers that we build software that works. Keep it up!
Conversely, an example of constructive feedback would look very similar, with the largest difference being the final fourth step:
> When you are late to meetings, it reduces the time we have to get through all of our agenda items. This inconveniences the team and reduces their trust in you to be reliable and timely. Please be on time from now on, thanks!
Study this last example. There are a few things at play here that you should pay attention to:
* The format for both positive and constructive feedback is the same. This conditions your team to see all feedback as a focus on behavior and not a personal attack on who they fundamentally are as human beings. If you keep all feedback formats the same, they won't cower at the thought of receiving negative feedback, and they won't have to worry about whether you're going to praise them or bring them down.
* For constructive feedback, it's about the future, not the past. We can't change the past, so there is no reason to dwell on it. Instead, accept that the undesired behavior happened; don't dwell on it or berate your teammates for their shortcomings. All we can do is stop undesired behavior from happening again by steering the rudder in the right direction.
If you follow these guidelines, your team will think of all feedback, positive or constructive, as a window into effective reinforcement or behavior change.
**Finally, I would recommend you wait to introduce feedback until you've had about 8 O3s (and start with only positive feedback).** You need enough time to build rapport with your directs before you begin to give them feedback. And even then, you want it to start positive. Add on another 8 weeks (16 in total) before introducing constructive feedback so they are used to your communication style.
##### Coaching
Once you are having regular meetings with your directs and giving them regular feedback, it's time to start coaching them in their careers. Coaching can take many forms, some of which I've already mentioned before:
* Pair programming
* Code review
* Mentoring
* Career development plans
I use all of these tools regularly and you should feel free to use all of them. **The point of coaching is to get more out of your directs.**
O3s build trust. Feedback is the transparency that results from that trust. Once you have transparency, you now know what your directs are capable of. To be a next-level manager, you need to strive to extract more value out of your directs by pushing them to be their best selves.
How do you get more out of your directs? **Reinforce their strengths and bring up their weaknesses.**
Do they write amazing tests? Have them review others' code for sufficient test coverage or run a workshop on effective testing strategies.
Conversely, if they constantly forget to test their code, consider introducing a pull request checklist to remind them to include tests and enforce that all code comes with tests to merge code into production.
You're going to have to get creative at this step because there are [boundless ways to coach someone](https://knowyourteam.com/blog/2020/02/19/how-to-coach-employees-ask-these-1-on-1-meeting-questions/) through their career. With all of this new work to do, how are you going to find time to still code?
##### Delegation
The only way to find time for everything is to lean on your directs. **Delegation is the skill of pushing down responsibility to others.** I like to think of this with an analogy:
Imagine the CEO of your company. He or she is likely the only person in the company with the knowledge of the entire ecosystem of how your company operates. They're also likely the only person who can both sell the product to investors and create a grand vision of how your company can wow your customers. The value of these two skills let's say is $500/hour.
Now imagine a paralegal at your company. Paralegals often review documents, perform research, and write legal papers on behalf of the lawyers they work with. Let's say these skills are valued at $35/hour.
Your CEO may have a law background and is quite capable of performing the duties of the paralegal. But if the CEO can do the legal work _and_ raise money from investors, while the paralegal can only do the legal work, why would you have the CEO ever do legal work? It's in the best interest of the company for the CEO to focus solely on the work he/she is uniquely suited to perform.
**In your case, you are uniquely suited to manage and tech lead. If you are doing _anything_ outside of those two things, you are reducing the effectiveness of your position and should delegate that responsibility to others.**
Delegation is the hardest part about being a TLM (or any manager for that matter). You are naturally going to want to do what you're comfortable doing: coding. Fortunately for you, coding is no longer your highest value skill.
**At every possible moment, you should evaluate whether the task you are performing could be done by someone else.** If it easily can, document the work involved and share it with your team to perform. If it cannot, then it's probably something you need to continue to do yourself. This self-evaluation is the basis for the next dial of engineering management: process.
#### Process
If engineering management centers around people, then Process is the mechanism for making the People dial as efficient as possible.
**Process is all about documentation.** If you have a repeatable, verifiable way of doing something, you should document it for your team or others to perform. Fun fact: this post is a manifestation of my process. Teaching others how to be engineering managers is a process. Reading this post is the process by which engineering managers can learn about how to manage for the first time. I know this is all getting pretty meta, so let's focus on a few ways to institute process:
##### Leverage a wiki tool
Tools like Confluence or GitHub Wikis are a great way to codify processes. The question I always ask myself is:
> Do other people need to know this and will forget it? I should write this down in the wiki.
I mean all of the time. Even the most seemingly mundane tasks are documented in our Confluence at work. **Never assume a task or process is too dumb to spell out in your documentation tool.** As we gain experience in the field we assume a lot of things about our work. Not everyone has the same experiences or background as us, so what may seem trivial to you could be quite complex to others.
That's why it's always a safe bet to write it down in the docs.
At worst, you took some time to become a better writer and communicator. At best, your documentation with the whole company and adds value to everyone around you. So why wouldn't you write it down?
##### Use it as a teaching moment
Looking for ways to incorporate coaching into your workday? Describe a process to a peer or direct. Gather feedback on your process: what works? What doesn't? What could be improved? Processes should help make everyone around you more effective. If you can teach something to others, you're building your skills as a coach and adding value to the organization.
##### Simplify your life
The beauty of processes is once it is set in stone, you pretty much never need to think about it again. Processes turn algorithmically intense into the automatic.
For example, if you didn't have a process for running effective meetings, you probably spent a lot of cognitive effort in figuring out how to set the agenda. Once you have a process, though, planning meetings become quite rudimentary and painless. You have a plan for how you're going to get through them.
Processes make your life easier, and your life is already challenging enough.
#### Product
One of the cool parts of moving into management is the cross-disciplinary skills you'll develop. **Get ready to become best friends with your product manager.**
I naturally consider myself to be a product-minded engineer. If this doesn't come naturally to you, it will be thrust upon you whether you like it or not. As the engineering manager, **you are now the liaison to other departments about the technical capabilities of your team.** Think of yourself as the head consultant and advisor to the rest of the company about your team.
What is the makeup of the skills on your team: are they front-end or back-end focused (or full-stack)? Who loves tackling bug tickets and who loves fixing tech debt? How will you balance tech debt and bugs with new features? Who decides whether a bug is more important than a feature? These and countless other questions will arise when you turn up the Product dial.
**Your best defense in dialing up Product is to be on the offensive, and that means getting to know your product and your product managers.** Just like your directs, set up O3s with your product manager. Get to know them and what makes them tick. Build trust. The only way you can build great software without you and your PM at each other's throats is to create an authentic relationship.
**There will usually be a healthy tension between product and engineering and that's okay.** Sometime you'll want to address mounting tech debt in your end-to-end tests. Meanwhile, your PM will wonder why Widget X wasn't delivered yesterday. You will each have reasonable justifications for why your stance is correct. Guess what? At the end of the day, you're going to have to come to a consensus. **See my earlier points about active listening, nonviolent communication, and building trust.**
Oh, and by the way, this extends far beyond Product. You'll have conflicts to resolve with architecture, design, marketing, sales, legal, and virtually anyone who comes in contact with your team: _including the team itself_. One engineer will say that Vim is the best IDE. Another will say it's Emacs. Spaces vs tabs. Linux vs Mac. It's pronounced "gif" (no it's "jif"). You name it, someone will have an opposing opinion.
**You are mediator first, moderator second, and arbiter last.**
What I mean by this is in any conflict, your first job is to _help each side empathize with the other_. Empathy, often confused with sympathy (which is about feeling someone else's distress), is about walking a mile in another person's shoes. Can you get Person A to see where Person B is coming from, and vice versa? If not, move into the moderation phase.
Moderation is about guiding the discussion to help the two parties address their concerns and air their grievances. This guidance should hopefully lead to automatic mediation: by moderating the discussion, you're allowing both parties to see where the other is coming from in a way you couldn't immediately achieve in your first pass. But what if consensus should not be reached?
As a last resort, it's time to be the judge, the jury, and the executioner. Ideally, people come to their conclusions themselves. The advice I've always latched onto is
> Your best ideas seem like they never came from you at all and instead came from the person you were trying to convince in the first place
Sometimes you don't have that luxury. In that case, you have to make a call for others. An often-quoted maxim of the military is **a poor decision is generally better than indecision.** Dwelling on whether you are making the right call in this argument and coming up with nothing is worse than just picking one side.
This Product dial morphed into conflict resolution but is an important aspect of the natural tension that will arise when you are no longer interacting solely within the discipline of engineering.
#### Prowess
Finally, we can't forget what got you here in the first place: your technical prowess. For TLMs, coding, tech leadership, and review should still be anywhere from 50 to 80% of your duties depending on the organization you are a part of. In my first formal engineering management role, I also played the role of a tech lead and spent nearly 80% of my time coding. The further up the food chain you go, the less you will be coding. **Do whatever you can to keep that number above 0%, even if that means coding after work.**
Why? **No one wants to work for someone who can't remember what it's like to be in the trenches.** The engineering managers I've respected the most are those who can still tango with the code. Like the CEO and paralegal analogy from above, you _could_ code, but you're choosing to delegate some of those tasks because a portion of your time is better spent managing.
I find that my technical prowess is best utilized through code review and coaching. Writing out a folder structure for React components is suboptimal compared to teaching a room full of engineers on how to organize their React projects.
**You only have two hands. Every teaching opportunity is a chance for you to give yourself more hands.**
**Systems Design becomes the primary skill** you'll be building in your technical dial. Now that you can't spend all of your time programming, you need to be thinking at a higher level as to how your systems will connect to operate in an efficient, scalable, and reliable manner.
[I love this Systems Design playlist on YouTube](https://www.youtube.com/watch?v=quLrc3PbuIw&list=PLMCXHnjXnTnvo6alSjVkgxV-VH6EPyvoX). Not only is it a good source of Systems Design interviews to implement in your hiring program, but it will build your skills when you inevitably interview somewhere else and want to demonstrate you know how to logically think about systems. That playlist will help you get the reps in and build this skill quickly. Do yourself a favor and watch all of the videos on this list.
### Other duties of the TLM you should know about
If you stick to mastering the Four Dials of Engineering Management and maintain your skills as a tech lead, you'll find the Technical Lead Manager role to be very rewarding and easy to transition into. Depending on your companies needs, there are a few other duties I want to call out that you should be aware of that EMs also do so you aren't blindsided:
* **Hiring.** When the money is rolling in, you're inevitably going to be growing your team. You're not only going to be on the other side of the interview, but you may not be interviewing for just technical skills. Oftentimes engineering managers assess things like culture fit and soft skills. I'll save this one for another post, but for now, I would just Google how to run a cultural interview because there are plenty of dos and don'ts you need to consider to keep your interview above board.
* **Firing.** With great power comes great responsibility. As much fun as it is to bring in great people, it is equally crushing to let go of underperformers. Performance management is all about retaining great talent and coaching your direct reports to be the best they can be. Sometimes that's just not enough. The key here is to **document everything and ensure you have a strong case for why this person cannot be saved.** This cannot be a gut feeling. You need plenty of documentation of continued underperformance and failure to adopt feedback before you should consider things like PIPs (Performance Improvement Plans). When in doubt, check with your HR or People team on how to handle these scenarios.
* **Performance Reviews, Compensation, and Promotion.** If you have great people and you are following the Four Dials of Engineering Management to get the most out of your team, you're putting them on the path to upskilling them. They will recognize their metamorphosis and demand more pay or a title bump. Like the previous point, these merits cannot be based on gut feelings alone. Instead, _help your directs build an ironclad case for why they should be recognized_. Someone will ask you to justify why your direct deserves a promotion and a bonus. If you can delegate the rationale to your direct, not only will they be able to justify their raise, but so will you.
## Conclusion and Further Reading
This post is long enough (and I'm tired enough) that you should be well on your way to a successful trial at engineering management. I want to thank [this post](https://randsinrepose.com/archives/meritocracy-trailing-indicator/) for inspiring me to consider Technical Lead Managers as a great bridge to delegating management responsibilities to your team.
All of these tools and strategies were tried and tested by me over the last several years as an engineering manager. All of them are backed up by the amazing engineering management books, podcasts, and blogs I've read over the years. An exhaustive reading/listening list can be found [here](https://leadership-library.dev), though I'd like to call out a few pieces I've found especially useful over the years:
* _The Effective Manager_ by Mark Horstman (and the accompany _Manager Tools_ podcast)
* _Resilient Management_ by Lara Hogan
* _High Output Management_ by Andy Grove
* _The 27 Challenges Managers Face_ by Bruce Tulgan (particularly for conflict resolution)
* _Accelerate_ by Nicole Forsgren, Jez Humble, and Gene Kim
* _An Elegant Puzzle_ by Will Larson
* _Extreme Ownership_ by Jocko Willink (and his _Jocko Podcast_)
* _The Making of a Manager_ by Julie Zhuo
* _The Manager's Path_ by Camille Fournier
* _The Talent Code_ by Daniel Coyle
I've read all of these books and are the basis for how this guide was built. If you want to learn more about anything discussed in this guide, I would encourage you to read the source material above. Let me know what you found most helpful or if you'd like clarity on any of this [on Twitter](https://twitter.com/theadamconrad) and I'd be happy to walk you through any of the finer points.
Best of luck and remember that **management is a skill just like anything else**. With a combination of study and practice, you'll be well on your way to greatness!
]]>2020-02-27T23:00:19+00:00Running Effective Planning Meetings
https://www.adamconrad.dev/blog/running-effective-planning-meetings/
Mon, 04 Nov 2019 13:17:19 +0000https://www.adamconrad.dev/blog/running-effective-planning-meetings/2019-11-04T13:17:19+00:00Ending Our Algorithmic Journey by Proving NP-Hardness
https://www.adamconrad.dev/blog/np-completeness/
Thu, 22 Nov 2018 20:17:00 +0000https://www.adamconrad.dev/blog/np-completeness/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/hard-reductions/), check that out first.
## Answers to the previous Daily Problem
> Show that the following problem is NP-complete: for a graph `G` with integers `k` and `y`, does `G` contain a dense subgraph with exactly `k` vertices and at least `y` edges?
When I look at this problem I instantly think of the tree of reductions that we went over the last few articles. So right away I'm thinking of the graph-specific configurations: independent set, vertex cover, and clique. The key in choosing these three is in the word _dense_...which of the graph types in the reductions list deals with dense configurations? My first instinct is clique because of how we want every vertex to touch every other node (think back to the example of the pentagram in the pentagon).
So if we can reduce a dense subgraph into a clique via a reduction, and we know that a clique is part of the NP-complete hardness tree, then we have our solution.
A clique is a fully-connected graph. So if the dense subgraph is a subset of the connections in a clique, your dense subgraph generalizes to clique and therefore the graph is also NP-complete. This is called a [K-clique densest subgraph](https://en.wikipedia.org/wiki/Dense_subgraph) if you want to know more about the formal proof.
## Nearing the end: tips for proving hardness
We've come a _long_ way in our journey in understanding data structures and algorithms. At this point we're entering territory where programming languages no longer matter; these theoretical algorithms have no real solutions so we can't map them to code samples. Whether you're studying this for fun or for interviews you may reach this final stage where you need to understand how to solve for uncharted territory. Here's a few tips on proving algorithmic hardness:
### Choose a good, simple NP-hard graph
The first and most important thing is to identify the right kind of NP-complete problem. Knowing the chart of problem translations is extremely handy in these situations. Here's a few based on how much you think you can fit in your head:
* [A simple one](https://upload.wikimedia.org/wikipedia/commons/8/89/Relative_NPC_chart.svg)
* [A bigger graph](https://www.edwardtufte.com/bboard/images/0003Nw-8838.png)
* [The motherload of current hard problems](https://adriann.github.io/npc/dot.png)
If you look at the Daily Problem, much of the effort came down to choosing the right source problem. Clique was clearly the right choice for K-clique densest subgraph. But even then, you could still find a reduction using vertex cover or 3-SAT, it would just be harder to make the translation from the source to the destination problem. For most of these problems, you pretty much only ever need to **start with 3-SAT, vertex cover, Hamiltonian path, or integer partition**:
* Worried about order or scheduling like a calendar? Start with _Hamiltonian path_.
* Dealing with selection and filtering? Start with _vertex cover_, though _clique_ and _independent set_ are obvious offshoots if you can see the distinction.
* How about combinatorics or large numbers? _Integer partition_ is what you want.
* For everything else? _3-SAT_ is at the top of the NP-complete food chain and is extremely versatile.
### Constrain the problem as much as possible
Constraints are good. An undirected graph becomes a directed graph. An unweighted graph adds weights. Or we can flip it: we did this when reducing TSP into a Hamiltonian cycle by turning weighted edges into unweighted ones, equalizing the cost of each edge for a Hamiltonian cycle.
Your constraints should inject harsh penalties for deviating away from your reduction. When we reduced Integer programming into 3-SAT, the maximization function and target value all became inconsequential. To ensure we didn't let that get in our way, we had to neutralize the maximization function to the identity function and set the target to `0`. Any other set of values would make it more difficult to prove.
Once you have the constraints, _force a decision_. Make it so that the choice is between A or B and never A _and_ B.
### Reductions are easier than algorithms but keep your options open
When your head is in the NP-complete or NP-hard space you automatically assume that the problem is unsolveable with an algorithm. However, non-polynomial problems in the exponential space can still be solved with techniques like dynamic programming to bring it back down into efficient polynomial space.
In other cases, you can use approximation algorithms with heuristics to turn NP-hard problems into polynomial-time solutions. Maximum clique, minimum vertex cover, and the Knapsack problem all benefit from approximation algorithms to arrive at a relatively correct but efficient result.
## Final thoughts
This project of auditing the class for _The Algorithm Design Manual_ was truly a labor of love. Learning by teaching is a great way to really cement your understanding of the materials in this series.
If you used this to help you understand data structures and algorithms for an upcoming interview, I want to leave you with a selection of problems over the whole book as sort of a final exam to test your knowledge. These problems are specifically designed for those interviewing at companies, so if you can tackle these, you'll be in a great spot. Good luck and thanks for sticking with me!
1. Problem 1-29.
2. Problem 1-31.
3. Problem 1-33.
4. Problem 2-43.
5. Problem 2-45.
6. Problem 2-47.
7. Problem 2-49.
8. Problem 2-51.
9. Problem 3-19.
10. Problem 3-21.
11. Problem 3-23.
12. Problem 3-25.
13. Problem 3-27.
14. Problem 3-29.
15. Problem 4-41.
16. Problem 4-43.
17. Problem 4-45.
18. Problem 5-31.
19. Problem 7-15.
20. Problem 7-17.
21. Problem 7-19.
22. Problem 8-25.
]]>2018-11-22T20:17:00+00:00Harder Intractable Reductions
https://www.adamconrad.dev/blog/hard-reductions/
Thu, 22 Nov 2018 20:17:00 +0000https://www.adamconrad.dev/blog/hard-reductions/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/easy-reductions/), check that out first.
## Answers to the previous Daily Problem
> Suppose we are given a subroutine which can solve the traveling salesman decision problem in, say, linear time. Give an efficient algorithm to find the actual TSP tour by making a polynomial number of calls to this subroutine.
First we need to find the length `l` of a shortest TSP tour. Then we just go over every edge on the graph and see if, when we remove that edge, our shortest TSP tour is still `l`. If it is, that edge wasn't part of the path so we remove the edge completely. The end result is a graph of only edges that build the `l` length of that shortest path. We only made a linear number of calls for our TSP decider since we just went over a linear number of edges in the graph.
## Two examples of harder reductions
Once again, a reduction is simply a way of turning one concept into another; _reducing_ it into a simpler/different problem. For more difficult reductions, we don't have the straightforward algorithms that we had from our previous post and will require a bit more creativity.
### 1. Integer Programming & 3-SAT
I have to be honest, the rules for integer programming seem oddly arbitrary and kind of contrived. Integer programming in the context of combinatorial optimization has to obey 4 rules:
1. Variables are integers
2. You've got some inequality constraints over those variables
3. A maximization function over some (or all) of those variables
4. An integer target you need to hit
Again, it's best to use JavaScript to provide a concrete example:
```js
const sampleIntegerProgramming = (a, b) => {
// constraint 1
if (a < 1) return false;
// constraint 2
if (b < 0) return false;
// constraint 3
if (a + b > 3) return false;
const maximizationFn = b * 2;
const target = 3;
return maximizationFn >= target;
}
sampleIntegerProgramming(0, 0); // false - invalid constraint 1
sampleIntegerProgramming(2, 2); // false - invalid constraint 3
sampleIntegerProgramming(1, 0); // false - maximization misses target
sampleIntegerProgramming(1, 2); // true - all rules satisfied
```
As you might expect, you can reach impossible situations. If you just change your `target` to a value of `5`, there is no possible set of integer arguments that would return a value of `true`.
We can reduce 3-SAT into integer programming; given we already know 3-SAT is hard, and we want to prove integer programming is hard, our translation will start with the 3-SAT problem. There aren't as many silly rules for 3-SAT, but we can translate it pretty clearly:
1. Variables are integers, restricted to `0` (false) and `1` (true).
2. Inequality constraints, based on rule 1, are that all variables (and their negations) are between `0` and `1`. Further, to ensure truthiness for every expression, we need to make sure all variables are less than or equal to `1` when summed with its negation (which must be greater or equal to `1`).
3. The maximization function is inconsequential as we've already mapped out all of the rules for 3-SAT, so we can keep it simple and just map this function to the identity function of one of our variables.
4. The target value also doesn't matter so to make sure it never matters we set it to something universally achievable like `0`.
With these four rules for IP described in the veil of 3-SAT, we've shown our creative translation. We know every 3-SAT can solve an IP because, under these 4 rules, every clause in 3-SAT must have an answer of `1` (true). The inequality constraints sum to ensuring we just have an answer greater than or equal to `1`. With a target of `0` we satisfy our target always since no 3-SAT problem can offer a result of false (`0`).
Flipping it around to have any IP problem provide a solution to a 3-SAT problem, we can say that we know all integer variables can only be `1` and `0`. Any `1` is a `true` in the 3-SAT problem. Any `0` is a `false` in the 3-SAT problem. These assignments simply need to satisfy all expressions in a 3-SAT problem to ensure its validity.
And just like that, we prove the reduction works both ways and shows that integer programming is a NP-hard problem just like 3-SAT.
### 2. Vertex Cover & 3-SAT
We saw the definition of vertex cover in the [previous article](/blog/easy-reductions/) and it, too, can be reduced via 3-SAT.
1. The boolean variables in 3-SAT are now two vertices (one for your variable's value, one for the negated value) in your vertex cover graph connected by an edge. We'll need `n` vertices to ensure all edges are covered since no two edges can share a vertex.
2. Each expression in 3-SAT now needs three vertices (one for each variable in a 3-SAT expression). Each of those vertices form a triangle since those variables are linked in a given expression. That gives us a number of triangles equal to the number of expressions in our 3-SAT problem. Two of the three vertices in a triangle are needed to make a vertex cover, so you're looking at `2e` number of vertices where `e` is the number of expressions in your 3-SAT problem.
3. Construct a graph with `2n + 3e` number of vertices, joining the graph made in step 1 and the graph made in step 2. What this graph looks like is a wall of variables connected to their negations at their top, and an edge connecting them to their equivalent placement in their 3-SAT triangles.
And with that, we translated 3-SAT into vertex cover. Think of this like a pie chart with a legend. Each triangle makes up the segments of the pie with a different color. Each color points to a value on the legend, saying what that color represents (either the variable or the negated variable).
To show this translation is correct, we show, just like the previous example, that one problem solves the other (and vice versa).
Every 3-SAT gives a vertex cover because when we have a set of truthful assignments for our expressions, we can find those variables in the "legend" within our vertex graph. Since those assigments are true, we know we are _covering_ one of the triangles. All of the other cross edges are covered by virtue of the fact that our "pie graph" is a triangle. The triangle configuration ensures coverability in one main portion, and the truthiness of our variables ensures we connect the triangles to the legend portion from step 1.
Conversely, every vertex cover gives a 3-SAT because in a graph of size `n + 2e` (the necessary size of a vertex cover graph), we just need to account for all vertices. Per our earlier definition, we need `n` vertices to represent our legend of possible variables. These vertices can define our 3-SAT truth assignment. The remaining `2e` vertices have to be spread across each variable _and_ it's negation. In other words, we need to ensure that for a given 3-SAT problem, either you're connecting your cover to the variable used in the expression (e.g. `v1`) or its negation (e.g. `!v2`). If you have a vertex cover, you've got at least one cross edge covered per triangle (representing the 3-SAT expression). In other words, you're satisfying all expressions with a connection, or an evalution of `true` in 3-SAT for that particular expression.
That's a mouthful, but we can go either way which means we have a reduction from 3-SAT to vertex cover! And with that, our *final* daily problem:
## Onto the next Daily Problem
> Show that the following problem is NP-complete: for a dense subgraph `G` with integers `k` and `y`, does `G` contain a subgraph with exactly `k` vertices and at least `y` edges?
## More problems to practice on
For harder, more creative reduction problems, check out Problem 9-13 from the homework set.
]]>2018-11-22T20:17:00+00:00Easy-ish Intractable Reductions
https://www.adamconrad.dev/blog/easy-reductions/
Thu, 22 Nov 2018 20:17:00 +0000https://www.adamconrad.dev/blog/easy-reductions/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/reductions/), check that out first.
## Answers to the previous Daily Problem
> We wish to compute the laziest way to dial given `n`-digit number on a standard push-button telephone using two fingers. We assume that the two fingers start out on the `*` and `#` keys, and that the effort required to move a finger from one button to another is proportional to the Euclidean distance between them. Design and analyze an algorithm that computes in time `O(n)` the method of dialing that involves moving your fingers the smallest amount of total distance.
And you thought you could get rid of dynamic programming, didn't you? So we've got 10 numbers on the keypad plus the special characters. In other words, you're asking yourself "how many phone numbers can I construct with two fingers?"
This breaks down into a bunch of subset of problems like "how many 10-digit numbers can I construct with my fingers on `*` and `#`?" which is a superset of the configuration where you slide your left finger up by 1 unit and pressed `7`, or "how many 9-digit numbers can I construct with my fingers on `7` and `#`". That is a superset of the configuration where you slide your left finger up by 1 again and pressed `4`, or "how many 8-digit numbers can I construct with my fingers on `4` and `#`." I think you see where this is going.
To drill this into your brains: the key here is that we see this repetition and _use the previous results via a cache to speed up our program_. Thus, all of those 10-digit phone numbers that start with `7`, we remember we store "move up 1" as a way to construct a phone number starting with `7`. We never have to worry about figuring that out again for all other numbers starting with `7`.
Because we're saving time by increasing the space, we reduce this algorithm down from something exponential to something linear. The algorithm, at a high level, looks like this:
1. Set a default value of `#` and `*` for your starting points. For a 1-digit or 2-digit number, the cost is simply the distance from that key to the new number for one finger (or two fingers).
2. For the remaining digits with a number range from `0` to `9`, store the cost in a 3D matrix `mtx[digit][fromLeft][fromRight]` where the first axis `digit` is the number of digits into the sequence you're in and `fromLeft` (or `fromRight`) is the successive score coming from a certain key on the keyboard given where your finger was previously placed.
3. Return the score of the matrix `mtx` for the `n` digits specified starting with the `*` and `#` keys.
Since we're filling one cell at a time in the matrix, this is just a simple run of our `n` digits across a matrix of space `100n`, since there are `10n` cells for each finger. We can reduce `O(100n)` down to `O(n)`, which is a linear-time algorithm.
## Three examples of easier reductions
To reiterate, a reduction is simply a way of turning one concept into another; _reducing_ it into a simpler/different problem. Here are a few more examples that we can extend upon from last time.
### 1. Hamiltonian Cycle & TSP
We all know the Traveling Salesman Problem: getting from one point to another as fast as possible. The Hamiltonian cycle is a similar problem with two major distinctions:
1. The graph is unweighted (i.e. in TSP every vertex has a cost, but in HC those costs are all the same)
2. The goal is to tour all elements exactly once. TSP also visits every edge (to find the best path), but that isn't the goal
Due to their similarities, we can express a Hamiltonian cycle as a reduction of a Traveling Salesman Problem:
1. First, make a copy of your Hamiltonian graph such that all vertices have the same weight
2. For every vertex against every vertex (an `n^2` loop), if the combo of these two edges expresses a vertex in your graph, set the weight of that vertex to be `1`. Otherwise, set the weight to `2`.
3. Use this newly-weighted Graph to decide if there is an answer to the TSP across all `n` of these vertices.
We know this works because if we find a Hamiltonian cycle on this graph, we can trace a TSP solution that has translated across all valid edges. Since each edge is of weight `1` and we have to explore all `n` vertices, the cost of the TSP tour has to be `n`. Reducing that back into the Hamiltonian cycle graph which has no weights, it would only use the edges of weight `1` which is the cycle we're aiming to find.
### 2. Vertex Cover & Independent Set
Still looking at graphs we see two more similar problems revolving around subsets of vertices in a graph.
A _vertex cover_ is a subset of vertices in a graph that touch all edges. Conversely, an _independent set_ is a subset of vertices in a graph that don't touch each over via an edge. For certain graphs, coloring in the vertices of vertex cover look like the inverse of what an independent set would be.
So, to handle the reduction, thus proving that both problems are just as hard to tackle, we've got a pretty straightforward algorithm:
1. Make a copy of the vertex cover graph
2. Make a copy of the number of vertices in your vertex cover subset
3. Find an answer to the independent set graph using your copied graph and vertex subset
The important thing here is we transform the inputs, not the solution. In fact, we don't even need to know what the solution is. Simply by translating the inputs into independent set problem we are able to solve for the hardness of vertex cover.
Oh, and to make things more fun: we reduced TSP into a Hamiltonian cycle, and a Hamiltonian cycle can be reduced into a vertex cover, just like an independent set.
### 3. Clique & Independent Set
For our last examples we look at cliques in graphs. Yes, like social cliques of friends, graphs can form a clique which is to say that a graph where a vertex has an edge to every other vertex in some kind of subset of vertices.
An example of a five-vertex clique would be the [pentagram](https://en.wikipedia.org/wiki/Pentagram#/media/File:Regular_star_polygon_5-2.svg) formed within a pentagon. Each corner of the pentagon forms a shape, connected the adjacent vertices. To solidify the clique, you now create edges to all of the other vertices that aren't part of the pentagon's outer shape. And what do you know, a clique is just a reduction of an independent set.
1. Copy your independent set graph, except the edges in this copy are the inverse of the edges in your independent set. In other words, if the edge `(i,j)` is not in your original graph, add that edge to your copy.
2. Find an answer to the clique graph with the same vertex subset and your new graph
As we can see here, we can imply the hardness of clique by the hardness of independent set, and thus the hardness of vertex cover. Where does this chain of hardness reductions end? That's where satisfiability comes in.
### Satisfiability: the one reduction to rule them all
NP-completeness derives all the way down to satisfiability. Any hard problem can be reduced down into a series of graph constraints like the three we saw above. Those problems all eventually reduce down to satisfiability. We can _satisfy_ a statement by saying at least 1 truth exists for every expression in a program. This is super easy to describe with JavaScript:
```js
let v1, v2;
v1 = true; v2 = true; // satisfied
v1 = false; v2 = false; // satisfied
v1 = true; v2 = false; // not satisfied
const expr1 = v1 && !v2;
const expr2 = !v1 && v2;
return expr1 && expr2; // if this returns true, we're satisfied
```
In the above example we can find configurations for `v1` and `v2` that satisfy the final boolean expression. Certain problems, no matter how hard you try, cannot be satisfied:
```js
let v1, v2;
v1 = true;
v2 = true;
const expr1 = v1 && v2;
const expr2 = v1 && !v2;
const expr3 = !v1;
return expr1 && expr2 && expr3; // never satisfied
```
Go ahead and try the above code in your console. No matter how you arrange `v1` and `v2`, you will never get your return statement to return a value of `true`.
One of the elementary reductions from satisfiability (the parent to vertex cover) is 3-Satisfiability, or 3-SAT. This is just a specialized form of the above but every expression must contain _exactly 3 variables_. And since we know satisfiability is hard, we know 3-SAT is hard, too. This can extend for all numbers larger than 3, so a statement made up of four-variable expressions is called 4-SAT, a statement with all five-variable expressions is called 5-SAT, and so on.
Why does this only go up and not down in numbers?
With one variable in 1-SAT, if you set that variable to `true` you have a trivial problem. For 2-SAT, you can solve it in [linear time with depth-first search](https://en.wikipedia.org/wiki/2-satisfiability), a polynomial-time algorithm and thus not in the real of non-polynomial hard problems.
From the _n_-SAT problems derives vertex cover and all of the other previous elementary reductions, so we'll stop there.
## Onto the next Daily Problem
> Suppose we are given a subroutine which can solve the traveling salesman decision problem in, say, linear time. Give an efficient algorithm to find the actual TSP tour by making a polynomial number of calls to this subroutine.
## More problems to practice on
For easier reduction problems, check out these from the homework set:
1. Problem 9-8.
2. Problem 9-10.
2. Problem 9-11.
2. Problem 9-12.
]]>2018-11-22T20:17:00+00:00Introduction to NP-Completeness and Intractable Problems
https://www.adamconrad.dev/blog/reductions/
Thu, 22 Nov 2018 20:17:00 +0000https://www.adamconrad.dev/blog/reductions/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/dynamic-programming-part-3/), check that out first.
## Answers to the previous Daily Problem
> Eggs break when dropped from great enough height. Specifically, there must be a floor `f` in any sufficiently tall building such that an egg dropped from the `f`th floor breaks, but one dropped from the `(f − 1)`st floor will not. If the egg always breaks, then `f = 1`. If the egg never breaks, then `f = n + 1`.
> You seek to find the critical floor `f` using an `n`-story building. The only operation you can perform is to drop an egg off some floor and see what happens. You start out with `k` eggs, and seek to drop eggs as few times as possible. Broken eggs cannot be reused. Let `E(k,n)` be the minimum number of egg droppings that will always suffice.
> (a) Show that `E(1,n) = n`.
> (b) Show that `E(k,n) = Θ(n^1/k)`.
> (c) Find a recurrence for `E(k,n)`. What is the running time of the dynamic program to find `E(k,n)`?
Before we completely leave dynamic programming, let's keep it fresh in our minds and think about how we can break this problem down.
### Answer to part A
To solve part A, we can intuit very simply that one egg requires starting at the bottom floor and, at worst, having to travel all of the way up each and every floor to prove what floor the egg breaks (if at all). Thus, 1 egg requires at most `n` floors.
### Answer to part B
To solve part B, we're leveraging binary search: as we reduce our eggs by `k-1`, we cut our floors in half which is a cost of `n/2^k-1`.
### Answer to part C
Finally, to solve part C, let's think of how we would solve this instinctually. If we have 100 floors and 5 eggs, it might initially make sense to just break the building up into 20 equal sections of floors. You may not get an exact match with such limited eggs but you know you'll be able to cover the building with all of your eggs at your disposal.
Diving in, let's say that section 1 (floors 1 through 20 in our example) doesn't break the egg, but section 2 (floors 21 and 40) does. We now have `k-1` eggs to explore a subproblem of this greater problem, because now we know our bounds have been reduced from 1 to 100 down to 21 to 40. You repeat this process, continuing to subdivide sections with your remaining eggs.
Generalizing this problem, we can say that, in the worst case, it will take us `n/s + s` attempts where `s` is the sections divided across the total number of floors `n`, plus the throw from the top section having the egg break at the top floor. What is the minimum section size we can use to reduce the number of hops?
Because we said min (or max) in that last question, this should harken back to [differential calculus](https://math.stackexchange.com/questions/321819/differential-problem-find-the-maximum-and-minimum-value). So if `f(s) = s + n/s`, then our derivative function is `f'(s) = 1 - n/s^2`, set to zero is `0 = 1 - n/s^2`. Rearranging this, we can say that `1 = n/s^2`, and therefore our section size is `s = sqrt(n)`.
To check for if this equation solves the minimum or the maximum, we now need to take the 2nd derivate of that equation, which is `f''(s) = 2n/s^3`. The rule states that if `s` is positive, we have `f''(s) > 0` and we have a minimum. Since you can't have negative number of sections for these floors, we now know we've reached our minimum floor section size. If that doesn't make sense, check out that Math Stackexchange link for a deeper explanation of how that differential equation works.
Going back to our original equation, since we know that a section is expressed as a fraction of `n`, we can then turn our equation into using only 1 variable, which means that `n/s + s` is now `n/sqrt(n) + sqrt(n)`, `n/sqrt(n)` is just `sqrt(n)`, so now it's `sqrt(n) + sqrt(n)` or `2sqrt(n)` is the maximum number of tries we need to find our solution.
All of that said, this means `E(k,n) = min( max(E(k,n-x), E(k-1,x-1)) + 1)`. What this is saying is that if you drop an egg and it doesn't break, we know the floors must be above where you're at and you still have your egg intact. The first function in the `max` equation is saying you would run this function with `n-x` remaining floors and the same number of `k` eggs. Conversely, if the egg breaks, we know the floors have to be below your current floor `x`, so that second `max` equation is saying you would run this function `E` with one less egg and the max floor being `x-1`. We need to take the worst-case scenario here which is why we are calling `max` over these two possibilities. We add in `+1` because you need to consider the cost of throwing the egg, which is a constant cost. Finally, for all possible `x` values to try against, we want to minimize the cost of these tries, which is why we wrap the _whole thing_ in a `min` function.
To compute the runtime of this dynamic programming algorithm, we know that `E(k, n)` runs across all `k` eggs and all floors `n` on a minimization function summing across all values of `n`, which reduces down to `O(kn^2)`.
## Reductions: turning one problem into another
If I had to summarize this section, it would be that *reductions are ways of expressing that two problems are identical.* If you can find a fast solution for problem A, it means, through transformation, you can find a fast solution for problem B.
### Examples
#### Closest pair of integers
One simple exmaple is finding the closest pair of numbers in an array. The naive implementation says find the difference across all numbers which is `O(n^2)`. The translation is to _sort the array first_ since the closest pairs will be adjacent to each other and therefore we can run the difference comparison in a linear fashion, bringing the complexity down to `O(nlog(n))`.
#### Least common subsequence
Another example we can utilize from our [dynamic programming section](/blog/2018-11-06-dynamic-programming-part-2/) involves the Longest Common Subsequence. We started that section off by talking about Edit Distance (i.e. spell checking resolves to either adding/removing/replacing/keeping a letter in a word). In many ways, LCS is just a _translation_ of Edit Distance: first we sort the numbers (again, by sorting we ensure the sequence is ever increasing), set `insertion` or `deletion` of numbers (to string together the longest sequence) to a cost of 1, set the `substitution` cost to infinity (since we don't allow substitutions in this problem), then simply subtract from our set the recursive call of the edit distance divided by two to get the lenght of the LCS. Why two? Every addition or subtraction has to happen as a pair; so the actual distance is about a "move" operation which doesn't affect the length of the array.
#### Least common multiple & greatest common divisor
Finally, a really common pairing of problems is in dealing with pairs of integers: finding their least common multiple (e.g. 24 and 36 has a LCM of 72) and greatest common divisor (e.g. for those same numbers the GCD is 12).
In fact, we can use one to derive the other: if we take two numbers, say `x` and `y`, and multiply them together and divide over their greatest common denominator, you get the least common multiple! Just like our previous two examples where sorting _translates_ a hard problem into an easier one, finding the GCD _translates_ the LCM problem from a intractable problem to a very easy one. Normally you'd need the prime factorization which cannot be done efficiently. But by deriving the LCD using the [Euclidean algorithm](https://en.wikipedia.org/wiki/Euclidean_algorithm) we can turn this into an extremely efficient `O(n)` algorithm!
We'll persue harder examples in the following posts, but for now let's move on to the next Daily Problem:
## Onto the next Daily Problem
> We wish to compute the laziest way to dial given `n`-digit number on a standard push-button telephone using two fingers. We assume that the two fingers start out on the `*` and `#` keys, and that the effort required to move a finger from one button to another is proportional to the Euclidean distance between them. Design and analyze an algorithm that computes in time `O(n)` the method of dialing that involves moving your fingers the smallest amount of total distance.
## More problems to practice on
The [last homework set for the class](http://www3.cs.stonybrook.edu/~skiena/373/hw/hw5.pdf) also covers chapter 9 on reductions as well, so here are your introductory problems into this space:
1. Problem 9-1.
2. Problem 9-2.
]]>2018-11-22T20:17:00+00:00Dynamic Programming in JavaScript Part 3 - Limitations
https://www.adamconrad.dev/blog/dynamic-programming-part-3/
Thu, 08 Nov 2018 23:30:00 +0000https://www.adamconrad.dev/blog/dynamic-programming-part-3/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/dynamic-programming-part-2/), check that out first.
## Answers to the previous Daily Problem
> A certain string processing language allows the programmer to break a string into two pieces. It costs `n` units of time to break a string of `n` characters into two pieces, since this involves copying the old string. A programmer wants to break a string into many pieces, and the order in which the breaks are made can affect the total amount of time used. For example, suppose we wish to break a 20-character string after characters 3, 8, and 10. If the breaks are made in left-right order, then the first break costs 20 units of time, the second break costs 17 units of time, and the third break costs 12 units of time, for a total of 49 steps. If the breaks are made in right-left order, the first break costs 20 units of time, the second break costs 10 units of time, and the third break costs 8 units of time, for a total of only 38 steps. Give a dynamic programming algorithm that takes a list of character positions after which to break and determines the cheapest break cost in `O(n^3)` time.
As always, we're going to approach this with the three step process outlined before for solving dynamic programming problems:
1. Describe the problem recursively
2. Ensure parameters are polynomial and not exponential
3. If so, we can store partial results along the way to be efficient
Can we describe this recursively? We know we want to compute something along the lines of `breakCost(str, start, end)` which says "find the minimum break cost of string `str` starting at position `start` and ending at position `end`".
Taking a step back, we know each split costs `n`. With `n` characters we can possibly break after to exhaustively search for costs, we know that is now costing us `O(n^2)` because we need to create a partial solution matrix that fills in the cost for all possible breaks in all possible positions. Just like partitioning books or comparing words for edit distance, the cubic time comes in after we've created our matrix value and we're now calculating the cost of our cached value of our previous partition with the next break and finding the minimum cost of the two _for all positions from our start position to the end position_ in our `breakCost` function, which is also of some fraction of length `n`.
So given we've had so many examples of dynamic programming working, when doesn't it work?
## Algorithmic limitations of dynamic programming
In general we must use algorithms that are both correct and efficient. What conditions cause dynamic programming to fail under these circumstances?
### In correctness
**Dynamic programming fails correctness if the recurrence relation cannot be expressed.** Imagine we wanted to find the most _expensive_ path from point A to be B without visiting a location more than once. Can we do it with dynamic programming?
One problem is we can never enforce the _visiting more than once_ part. When you create a recurrence relation for this problem you're trying to compute a future cost for all remaining locations against your previously-stored solution up to some intermediate point. But in dynamic programming, computing that max cost could include points you've already visited. You could very easily create a cycle and an infinite loop here.
The other problem is ordering. Sure, you know you start at A and end at B, but nothing dictates which order we transition from point-to-point in between A and B - is it biggest to smallest? Left to right? Without an order we also have the opportunity to infinite loop if the cost pushes us into one direction that is cyclical.
As we saw in step 3 of our dynamic programming checklist, if we can't utilize a partial solution to help compute the state after the partial solution, the caching doesn't get us anywhere. In all of our previous examples we have an implicit ordering to evaluate things like strings or arrays. But when you have something like an undirected graph such as a network of cities on a map, you lose that order and you can't create a reliable recurrence relation.
### In efficiency
**Dynamic programming fails efficiency without order of the input object.** Again, without order, we don't have a reliable recurrence relation. Which means we can't express in faster polynomial time but the full exponential time. Since the partial results do nothing for us, our recursion is unbounded, and recursion blows up to exponential time in the worst case (see our naive fibonacci algorithm in [part 1](/blog/dynamic-programming-part-1/) for proof).
So yeah...if you have maps, networks, graphs without any sort of ordering...don't use dynamic programming.
## Concluding on Dynamic Programming
Dynamic programming is hard. You need to look at this stuff a bunch of times before it clicks. If you still need more examples, check out [this page](https://people.cs.clemson.edu/~bcdean/dp_practice/) for 10 more problems to practice. If you follow the three-step process of defining the recurrence relation, expressing the parameters in polynomial time, and storing the partial results in an array/matrix, you can pretty much solve any dynamic programming problem.
With enough examples and banging your head against the wall, these will start to become second nature (but you've got to practice!). And with that, onto the Daily Problem...
## Onto the next Daily Problem
> Eggs break when dropped from great enough height. Specifically, there must be a floor `f` in any sufficiently tall building such that an egg dropped from the `f`th floor breaks, but one dropped from the `(f − 1)`st floor will not. If the egg always breaks, then `f = 1`. If the egg never breaks, then `f = n + 1`.
> You seek to find the critical floor `f` using an `n`-story building. The only operation you can perform is to drop an egg off some floor and see what happens. You start out with `k` eggs, and seek to drop eggs as few times as possible. Broken eggs cannot be reused. Let `E(k,n)` be the minimum number of egg droppings that will always suffice.
> (a) Show that `E(1, n) = n`.
> (b) Show that `E(k,n) = Θ(n^1/k)`.
> (c) Find a recurrence for `E(k,n)`. What is the running time of the dynamic program to find `E(k, n)`?
## More problems to practice on
To wrap up dynamic programming the last problems in the chapter worth checking out from the homework set include:
1. Problem 8-18.
2. Problem 8-19.
]]>2018-11-08T23:30:00+00:00Dynamic Programming in JavaScript Part 2 - Examples
https://www.adamconrad.dev/blog/dynamic-programming-part-2/
Tue, 06 Nov 2018 23:30:00 +0000https://www.adamconrad.dev/blog/dynamic-programming-part-2/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/dynamic-programming-part-1/), check that out first.
## Answers to the previous Daily Problem
> Suppose you are given three strings of characters: `X`, `Y` , and `Z`, where `|X| = n`, `|Y| = m`, and `|Z| = n + m`. `Z` is said to be a _shuffle_ of `X` and `Y` iff `Z` can be formed by interleaving the characters from `X` and `Y` in a way that maintains the left-to-right ordering of the characters from each string.
> (a) Show that `cchocohilaptes` is a shuffle of `chocolate` and `chips`, but `chocochilatspe` is not.
> (b) Give an efficient dynamic-programming algorithm that determines whether `Z` is a shuffle of `X` and `Y` . Hint: the values of the dynamic programming matrix you construct should be `Boolean`, not numeric.
The way I'm approaching this is thinking about how my brain would solve it before finding the recurrence relation. So to answer A:
Two passes over the string array `Z`. First pass you are picking off letters from `X` until `X` is empty (i.e. so find `c`, now `X` is `hocolate`). If `X` is empty (meaning all letters were found _in order_), start over and run through `Z` again with the same heuristic but look for `Y` instead. If both `X` and `Y` are empty, you have found the shuffle. This will run in `O(Z)` time since you pass over `Z` twice. You won't find this shuffle with `chocochilatspe` because when `chips` is down to `ps` it will find `p` and then `s` will be left and the two variables `X` and `Y` will not be empty.
Answering B is, of course, the real exercise Skiena is looking for us to undergo. Remember, determining a dynamic programming problem involves 3 steps:
1. Create a recurrence relation or recursive algorithm to describe the problem.
2. The parameters for your relation should be polynomial (i.e. recursion can take expontential time, if the number of solutions is less than exponential, that's polynomial or less possible solutions and thus cheap enough to pursue)
3. Organize your alogrithm so that you can leverage your cached partial results when you need them to save time in exchange for space.
So answer A is not recursive...or is it? I now realize that the "picking off" of the variables can be done recursively by simply providing the function `X`, `Y`, and `Z` with the first letters chopped off. It might make more sense if I just write down the recursive algorithm now:
```js
const weave = (str, z, newIdx = 0) => {
if (!str.length) return z;
for (let i = newIdx; i < z.length; i++) {
if (str[0] === z[i]) {
// storing the partial results into the arguments of the recursive function = DP!
return weave(str.slice(1), z.slice(undefined,i).concat(z.slice(i+1)), newIdx);
} else {
newIdx++;
}
}
return false;
};
const isShuffle = (x, y, z) => {
return y === weave(x, z);
};
```
Before I explain the algorithm, the fact that I created a recursive solution with a defined number of arguments proves parts 1 and 2 of the dynamic programming requirements, so the real trick is to explain part 3.
The first trick I realized in writing this is that when you pick off all of the letters in `X`, if you have a shuffle, all that is left is `Y`, so we only ever need to "weave" in letters from `X` into `Z`! That means if the first word is picked off of the joint word, you can just compare the two strings for equality.
Now for the recursion. To recurse, we store the smaller versions of `X` and `Z` in the arguments of the function. To maintain order, we have to always check the first letter of `X` or else we don't maintain the rules of the shuffle. The annoying part comes with the larger word.
To store the partial results, this came down to removing the matched letter from the larger `Z` string _and_ ensuring when we start our `for` loop again within our recursed `weave` that we move along where we left off. This ensures that we only actually have to run this algorithm in `O(Z)` in one pass rather than 2 in answer A. In fact, if your `X` is solved before `Z` is empty, you already have `Y` so in the best case scenario `Z` is just `X.concat(Y)` in order. Therefore, you can actually solve this in the best-case sublinear time of `Ω(Z-Y)`.
Having fun yet? Let's keep exploring with more examples!
## Example 1: binomial coefficients
Everyone uses Fibonacci to talk about Dynamic Programming. The problem is if you only ever hear Fibonacci, your brain will train itself to remember the answer to calculating Fibonacci, and not necessarily how to apply Dynamic Programming.
Let's take another example from the book: binomial coefficients. This is just another way of saying:
> How many different ways can I count something?
So if I have the numbers `1,2,3,4` and I want to pick 2 numbers from them, how many pairs will that generate? The answer is 6, because I can grab `(1,2)`, `(1,3)`, `(1,4)`, `(2,3)`, `(2,4)`, and `(3,4)`.
A technique for quickly generating these coefficients is [Pascal's triangle](https://en.wikipedia.org/wiki/Pascal%27s_triangle) which you can read on your own for fun. Suffice it to say, it looks an awful lot like how you would generate a Fibonacci sequence. In this case, it's the sum of two numbers directly above it in the triangle.
We can map this idea out as a function for our binomial coefficient pretty quickly in code. Remember that for Fibonacci we iterated on our code three times: once with recursion, once with recursion and memoization, and finally a third with in-place memoization over a loop. To illustrate that we don't need recursion to make Dynamic Programming work, let's see binomial coefficients mapped out as a loop:
```js
const binomialCoefficient = (n, k) => {
const bc = [[]];
// fill along the edges of the triangle
for (let i = 0; i < n; i++) bc[i][0] = 1;
for (let j = 0; j < n; j++) bc[j][j] = 1;
// fill in the middle
for (let i = 1; i < n; i++)
for (let j = 1; j < i; j++)
bc[i][j] = bc[i - 1][j - 1] + bc[i - 1][j];
return bc[n][k];
}
```
You'll see that the bigger `for` loop is just the definition of the [recurrence relation](https://en.wikipedia.org/wiki/Recurrence_relation#Binomial_coefficients) for binomial coefficients.
You can see there's a pretty similar pattern here to the last Fibonacci algorithm we drew up. First, we fill in the default values and edge cases. Next, we iterate through our numbers to precompute values in an array. Finally, we draw values from that array to grab our coefficient value.
The only real difference is that we have to supply two values now instead of one and the math of computation is slightly varied given that the recurrence relation isn't exactly the same as that of Fibonacci.
So these two examples are pretty similar. For my last trick, let's go in a completely different direction to really cement this idea.
## Example 2: Checking for spelling errors
Did you know that you can create a spellchecker with Dynamic Programming? Yeah, you can! It's the same formula: try recursion, then memoize, then optimize. So what happens when you spell something? You're dealing with 4 scenarios for the spellchecker to validate:
1. **It matched.** Each letter of `soil` is evaluated and compared to be correct.
2. **It swapped.** For `soel`, you have to swap the `e` for an `i` when you get there.
3. **It added.** For `sol`, you have to add in the `i` when you get to `l` too early.
4. **It removed.** For `soile`, you have to remove the `e` when you expected the string to end.
And all but the first scenario carries a cost since the correct string requires no changes but each of the other three requires a change.
To map this out, we have a function that takes our correct word (from something like a dictionary) and compares it to the word we typed out. To do this we're going to create a matrix of all possibilities from how to get from a letter in our target word to a letter in our compared word. I'll write it all out now then we can explain it a bit more afterwards:
```js
const compareWords = (targetWord, comparedWord) => {
const costs = [0,0,0]; // each slot is associated with a MATCH, INSERT, or DELETE cost
const INSERT_COST = 0;
const DELETE_COST = 0;
const matrix;
let i, j, k; // counters
// initialize the matrix 2D array
// our first row
for (i = 0; i < targetWord.length; i++) {
matrix[0][i].cost = 0;
if (i > 0) {
matrix[0][i].parent = 'I'; // for insert cost
} else {
matrix[i][0].parent = -1; // no parent since we're in the top corner
}
}
// our first column
for (j = 0; j < compareWord.length; j++) {
matrix[j][0].cost = 0;
if (j > 0) {
matrix[j][0].parent = 'D'; // for deletion cost
} else {
matrix[j][0].parent = -1;
}
}
// determine costs
for (i = 1; i < targetWord.length; i++) {
for (j = 1; j < compareWord.length; j++) {
// match cost
costs[0] = matrix[i-1][j-1].cost + (compareWord[i] === targetWord[j] ? 0 : 1);
// insert cost
costs[1] = matrix[i][j-1].cost + 1; // always costs 1 to insert a letter into our target word
// delete cost
costs[2] = matrix[i-1][j].cost + 1; // always costs 1 to delete a letter from our target word
// assume match cost is cheapest at first
matrix[i][j].cost = costs[0];
matrix[i][j].parent = 'M'; // for an exact match being made
for (k = 0; k < 3; k++) {
if (costs[k] < matrix[i][j].cost) {
matrix[i][j].cost = costs[k];
matrix[i][j].parent = k === 0 ? 'M' : k === 1 ? 'I' : 'D';
}
}
}
}
// find the final cost at the end of both words
return matrix[targetWord.length-1][compareWord.length-1].cost;
}
```
So what is this really doing? There's really 4 parts to this solution:
### 1. Matrix initialization
We know that the edges along the rows and columns have a fixed cost at the extremes. When you compare a word to nothing, the cost scales linearly by letter as you add one more. And we know that a blank word carries no cost.
### 2. Cost calculation
Next, we compare the current position in the matrix to some previous adjacent cell. That leaves us with 3 calculations to compute:
1. **Moving up and to the left.** We assume we matched the previous cell (i.e. if we matched, both characters can advance) so all that is left to calculate is if the cell matches or not. It costs 0 for an exact match and 1 if we have to substitute one letter for the other.
2. **Moving up.** Our target character had to advance once without advancing our comparison word. In other words, we had to insert a character which carries a cost of 1 no matter what.
3. **Moving left.** Converseley, if our target character cannot advance but our comparison word did, we would need to delete a character to reduce matching costs which, again, has a price of 1 no matter what.
### 3. Optimize our path forward
Now that our costs are calculated in comparison to previous adjacent cells, we can find the minimum cost thus far. We just iterate through all of the cost enumerations and pick the cheapest one of the three. From here we can begin moving onto the next cell in the matrix.
### 4. Determine the total cost
Once we've enumerated through every cell in the cost matrix, we simply find the final cost at the very bottom of our matrix, which has compared every letter in our `targetWord` with every letter in our `compareWord`. That final cost is the cheapest cost to navigate from the target word to our comparison word.
## String comparisons solve a few problems
The beauty of this solution is that it can really work in a variety of other problems:
### Substring matching
Need to spellcheck just a word within a whole text? Initializing the first row removes the conditional to assign an insertion cost since we're not trying to build one string into the other. Then in step 4, rather than count the cost of the letter in the big text, we just find the cost at the last point in our substring rather than the whole text.
### Longest Common Subsequence (LCS)
This is one of those classic problems in Dynamic Programming: you want to find all of the shared characters between two words if you only advance character-by-character. The example in the book uses `democrat` and `republican` with a LCS of `eca` (i.e. `e` is followed by a `c` is followed by an `a` in each of the two words, and it's of length 3 which is longer than any single letter in both words and is longer than, say, `ra` which is also a subsequence in the two words).
Leveraging our existing code, if we just make the cost of a substitution in our match calculation `2` instead of `1`, the cost is disproprotionately greater than any other cost and we'd never want to swap out letters.
Why is this the solution? Think about it this way: all we ever want are matches. We want all of the other letters out of our way, which either means insertion or deletion just to get the exact sequence of something like `LETTER-e-LETTER-LETTER-c-LETTER-LETTER-a`. The least number of insertions/deletions to form an equal patter between our target and comparison words gets us our common subsequence.
## Example 3: Partitions
Imagine you have a bunch of books in a library. The head librarian has already given you all of the books organized by number of pages in the book. He then gives you a few book dividers to segment the books into sections, ideally as uniformly as possible. How would you do it?
### A naive approach
Let's say the example is super basic: 2 dividers (3 sections) with 9 books: 100 pages, 200 pages...all the way up to 900 pages. The naive way to handle this is to divide the sum of all pages of all books by the number of sections and add in your divider when your ideal division is exceeded by the next book.
So with our example we have 4,500 total pages over 9 books to divide in 3 sections, which makes each section ideally 1,500 pages. The first section creeps over 1,500 pages after book 5, so we put a divider there and it helps that we hit exactly 1,500 pages in this section. The next section creeps over after book 7, so we put the next divider there but we only have 1,300 pages in this section. We've used up all dividers, so we're left with our last section having books 8 and 9, which has a division of 1,700 pages.
This works because while 1,500 is the ideal, the last two sections have a delta of only 200 pages in either direction, so no section is worse than the other and there's no other possible configuration that would further minimize the cost from the maximum distance of the ideal.
Of course, this is a pretty ideal example and we haven't applied any dynamic programming principles to achieve this result, which means we know there is more on how to approach this. Can we leverage any of this previous knowledge to improve our algorithm?
### Using dynamic programming
The truth is we need an exhaustive search involving recursion to solve this problem. Describing this problem with recusion and storing partial results along the way gives us our dynamic programming solution. How will we manage this?
Remember that to solve the full problem, we should be able to describe a subset of the problem. In other words, if we solved the above example with only 8 books, that should give us information to help solve the total solution with 9 books. And in the other direction, that should should be based on a solution when we only have the first 7 books. What is the cost of adding this last divider?
#### Maximum sectional cost
That cost is the sum of the remaining elements. The cost we would choose is the maximum of either this current section or the largest section thus far. Why? Either we are breaking new ground and we've forced an increase in our total worst cost per section, or we have remained within the cost cap we have defined in all previous sections. This is like if our last book in our example is 1,000 pages instead of 900 - the cost is 1,800 pages which is 300 pages from our ideal instead of the previous delta which is 200 pages (i.e. 600 + 700 = 1,300 which is 200 pages off of the first section at the ideal 1,500 pages).
#### Minimum sectional distance
Now we mentioned in the previous example that while we're totalling some maximum, we want to minimize that delta from the ideal, so how do we minimize the total cost per section? Since we have `k` sections (and `k-1` dividers), we need to minimize the cost with the remainder of books if we have `k-1` sections (or `k-2` dividers). Which is to say, we need to know if the previous remainder was trending in the right direction to set us up for a reasonable maximum with our current last divider available. And because we are trying to solve this as a subset of the previous problem, _this is our recurrence relation_.
We've got enough words here and not enough code, so let's try and write out what we have above:
```js
const printBooks = (books, start, end) => {
books.forEach(book => console.log(book));
console.log('\n');
};
const reconstructPartition = (books, dividers, length, divisions) => {
if (divisions === 1) return printBooks(books, 1, length);
reconstructPartition(books, dividers, dividers[length][divisions], divisions-1);
return printBooks(books, dividers[length][divisions]+1, length);
};
const partitionize = (books, divisions) => {
let length = books.length;
let allSums = [0];
let cachedValues, dividers, cost;
// collect the cumulative sums
for (let i = 1; i <= length; i++) allSums[i] = allSums[i-1] + books[i];
// boundary condition - one division is just the total sum
for (let i = 1; i <= length; i++) cachedValues[i][1] = allSums[i];
// boundary condition - one value across all sums is just one section
for (let j = 1; j <= divisions; j++) cachedValues[1][j] = books[j];
// store partial values as we walk the recurrence relation
for (let i = 2; i <= length; i++) {
for (let j = 2; j <= divisions; j++) {
cachedValues[i][j] = Number.POSITIVE_INFINITY; // can't get worse than the max cost
for (let x = 1; x <= i-1; x++) {
cost = Math.max(cachedValues[x][j-1], allSums[i]-allSums[x]);
if (cachedValues[i][j] > cost) {
cachedValues[i][j] = cost;
dividers[i][j] = x;
}
}
}
}
return reconstructPartition(books, dividers, length, divisions);
};
```
The code here should be self-explanatory: it is meant to work out the algorithm above. The real key is iterating through all of the books and all of the divisions to get the minimum cost after finding the maximum between the previously cached value and the current sum into the partition. If we hit that, increment our dividing line and store the next cached value.
Study it a few times and if you have any questions feel free to hit me up on Twitter. We've covered a few examples here and there's a bunch to unpack so if you need to read over these a few times that's okay!
## Onto the next Daily Problem
> A certain string processing language allows the programmer to break a string into two pieces. It costs `n` units of time to break a string of `n` characters into two pieces, since this involves copying the old string. A programmer wants to break a string into many pieces, and the order in which the breaks are made can affect the total amount of time used. For example, suppose we wish to break a 20-character string after characters 3, 8, and 10. If the breaks are made in left-right order, then the first break costs 20 units of time, the second break costs 17 units of time, and the third break costs 12 units of time, for a total of 49 steps. If the breaks are made in right-left order, the first break costs 20 units of time, the second break costs 10 units of time, and the third break costs 8 units of time, for a total of only 38 steps. Give a dynamic programming algorithm that takes a list of character positions after which to break and determines the cheapest break cost in `O(n^3)` time.
## More problems to practice on
For even more dynamic programming practice problems, check out Problem 8-14 in the book.
]]>2018-11-06T23:30:00+00:00Star athlete or engineer? The Big Five in tech are paying like the Big Four in sports
https://www.adamconrad.dev/blog/the-big-four-sports-or-tech-companies/
Sat, 03 Nov 2018 23:30:00 +0000https://www.adamconrad.dev/blog/the-big-four-sports-or-tech-companies/2018-11-03T23:30:00+00:00Dynamic Programming in JavaScript Part 1 - The Basics
https://www.adamconrad.dev/blog/dynamic-programming-part-1/
Thu, 01 Nov 2018 23:30:00 +0000https://www.adamconrad.dev/blog/dynamic-programming-part-1/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/shortest-paths/), check that out first.
## Answers to the previous Daily Problem
> _Multisets_ are allowed to have repeated elements. A multiset of `_n_` items may thus have fewer than `_n!_` distinct permutations. For example, `{1,1,2,2}` has only six different permutations: `{1, 1, 2, 2}, {1, 2, 1, 2}, {1, 2, 2, 1}, {2, 1, 1, 2}, {2, 1, 2, 1}, and {2,2,1,1}`. Design and implement an efficient algorithm for constructing all permutations of a multiset.
This question is basically asking "give me all of the shapes of these items". Each permutation is a shape, or arrangement, of the 4 components, 2 of which are similar.
One approach would be to leverage backtracking, trying to figure out from those end results how do you get to the result with `_n-1_` values left. This is a call back to our previous articles to re-assert your knowledge of backtracking algorithms. I will offer an alternative approach below:
Another naive approach is to make an `_n^2_` loop between the number of possibilities and the number of slots and just insert each item into each slot. So 1 goes to first slot, then second slot, then third...then the second one goes into the first slot, then second slot... and you build out the possibilities by filling in the remaining options.
The problem is that doing the example above would arrive at duplicates (inserting the first 1 into the 4 slots yields the same shape of numbers as would if you inserted the second 1 into the 4 slots). So how do we eliminate these possibilities?
**Caching.** As we'll see in this article, when we cache previous results, we can throw out unnecessary moves like constructing an unnecessary permutation. Once we've inserted the four options for the first 1, we don't ever have to calculate them again for any other 1s because they yield the same output. Same with the 2s. So in the multiset example above, there are 24 different arrangements, duplicated twice (once for the 1 and once for the 2). In other words, we have `6x2x2 = 24`, or 6 permutations that are duplicated once for `1` (x2) and once again for `2` (x2).
If that doesn't make sense to you, check out how we handle generalized recurrence relations below.
## Dynamic Programming = Recursion + Memoization
Functions that recurse call themselves. _Memoization_ is just storing previously-computed results. Honestly, the best way to think about this is with a classic example, the Fibonacci Sequence:
```js
const fibonacci = (n) => {
if (n === 1) return 0;
if (n === 2) return 1;
return fibonacci(n - 1) + fibonacci(n - 2);
}
```
All this algorithm says is
> To get the number you're looking for, add up the two previous numbers in the sequence
The first few numbers of Fibonacci are `0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55`. Simple, right? What's the 24th number of Fibonacci? Oof! Yeah, this one gets rough pretty quick (in fact, it is _exponentially_ bad!)
So we've got the recursive part in this function, but we're missing the memoization. If we add in memoization, this problem becomes virtually linear in time, with the tradeoff of an array to store previous results:
```js
const fibonacciCached = (n, cache = [0, 1]) => {
if (n === 1) return 0;
if (n === 2) return 1;
if (!cache[n]) cache[n] = fibonacciCached(n - 1) + fibonacciCached(n - 2);
return cache[n];
}
```
Literally one extra line to cache the result. The first solution will heat up your laptop if you find the 45th Fibonacci number from how hard it will be working. The latter will run extremely fast. All we did was make sure that if we've found this number before, we look it up in an array (constant-time lookup) as opposed to computing it all over again.
Simple, right? Well, not exactly.
For starters, this only works if you truly refer to the things you've computed. It works great here because we always refer to all of the numbers computed (since it's always referring to lower numbers). But what about backtracking a path or with depth-first search? You may not always revisit everything there, in which case you've stored things unnecessarily.
Recursion also makes things painful. Recursion really eats up stack space. If we can cache with primitives and not recurse, that's even better. With Fibonacci, it's pretty easy:
```js
const fibonacciArr = (n, cache = [0, 1]) => {
for (i = 2; i <= n; i++) cache[i] = cache[i - 1] + cache[i - 2];
return cache[n];
}
```
Brilliant! The shortest and the fastest solution yet with _`O(n)`_ time and space! Now, to get really weird, we have to remember one thing that you probably noticed as you were counting Fibonacci: you only ever looked back two numbers when finding the next. Could our computer do that and only have to store 2 numbers as opposed to _n_? You bet!
```js
const fibonacci2 = (n) => {
let first = 0, second = 1, next;
if (n <= 1) return 1;
for (let i = 2; i < n; i++) {
next = first + second;
first = second;
second = next;
}
return first + second;
}
```
A couple of more lines, but we all we're doing is computing the next number by adding the two previous numbers (like we first specified in our definition of Fibonacci). To save space, we overwrite the variables and keep chugging along. The solution still runs in _`O(n)`_ time but the space is now constant (the two recorded numbers plus the pointer to our upcoming to-be-computed number).
One final thing I'll say about these last two examples: you may have noticed we traded in the full caching solution for less space. If constant space and linear time is fine with you (which it probably is) then stick with this.
However, if you plan to call a function over, and over, and over again, permanently storing a singleton of the cached values, it might be better to call something like `fibonacciCached` since you can always check if someone else has done this work before. The storage space is linear while the lookup time is constant for all but the uncharted territory. If you expect to visit previous values often, consider this alternative.
## The basics are just scratching the surface
We've now introduced the fundamental aspects of Dynamic Programming with a few examples and variations. Read over the code a few times to let it sink in (trust me, it took a _really_ long time for me to see this). We've got 2 more parts of this chapter to cover, so in the meantime, take a look at the Daily Problem and we'll see you soon!
## Onto the next Daily Problem
> Suppose you are given three strings of characters: `X`, `Y` , and `Z`, where `|X| = n`, `|Y| = m`, and `|Z| = n + m`. `Z` is said to be a _shuffle_ of `X` and `Y` iff `Z` can be formed by interleaving the characters from `X` and `Y` in a way that maintains the left-to-right ordering of the characters from each string.
> (a) Show that `cchocohilaptes` is a shuffle of `chocolate` and `chips`, but `chocochilatspe` is not.
> (b) Give an efficient dynamic-programming algorithm that determines whether `Z` is a shuffle of `X` and `Y` . Hint: the values of the dynamic programming matrix you construct should be `Boolean`, not numeric.
## More problems to practice on
Now that [homework 5](http://www3.cs.stonybrook.edu/~skiena/373/hw/hw5.pdf) is out, here are a few problems that are relevant to the dynamic programming we've discussed so far:
1. Problem 8-3.
2. Problem 8-7.
Think on this and we'll check it out in the next few days as we explore more examples of Dynamic Programming.
]]>2018-11-01T23:30:00+00:00Combinatorial Optimization in JavaScript
https://www.adamconrad.dev/blog/combinatorial-optimization-in-js/
Tue, 30 Oct 2018 23:30:00 +0000https://www.adamconrad.dev/blog/combinatorial-optimization-in-js/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/maximum-flow-algorithms-networks-js/), check that out first.
By working backwards and using heuristics to _approximate_ a correct solution, you can get close enough to a correct solution that is acceptable without having to run your computers for all of eternity. Before we get into that, let's answer the previous article's Daily Problem:
## Answers to the previous Daily Problem
> Let `G = (V,E)` be a directed weighted graph such that all the weights are positive. Let `v` and `w` be two vertices in `G` and `k ≤ |V|` be an integer. Design an algorithm to find the shortest path from `v` to `w` that contains exactly `k` edges. Note that the path need not be simple.
Keep track of all `i` hops to all `i` vertices. The shortest path for `k` edges is just the shortest path for `k-1` edges plus some edge in our `i`. We can _backtrack_ recursively from the end towards the beginning, appyling the lightest-weight edge as we head backwards towards our starting vertex.
This is the basis for how we will approach problems going forward.
## Backtracking as a means of defining heuristics
Have you ever played Sudoku? Then you can understand backtracking! Backtracking, as we alluded to earlier, is the systematic way of going through all possible configurations of a search space, whether it is how they are arranged (_permutations_) or how they are divided (_subsets_). At each step in backtracking, we are trying to solve a portion of the total problem.
Let's say we want to get from one end of the graph to the other; point A to point Z. And further we know that there are 5 different ways to get there. What backtracking will do is figure out all of the ways to get from A to Z by adding 1 more point to the solution. It will then ask "did we get to Z?" if it did, we can print out the solution, if not we have to see if we hit a dead end or if we can extend it further.
Since we're going _deeper_ into a list of possibilities leading from A to Z, it makes sense that we could implement backtracking with DFS. BFS also works since we are trying to find all possible solutions (i.e. going wide will accomplish the goal just as going deep will), but it takes up much more space since we're guaranteeing to hit all of the dead ends rather than only going deep on the ones that get us closer from A to Z.
I was going to post some code that offers a way of explaining backtracking, but to be honest after looking over the algorithm it's much easier to understand as a recipe list and then apply that to a specific problem. Therefore, let's generalize backtracking into this recipe:
1. For a given solution `A`, if it's a complete solution that gets us from A to Z, print it! If we haven't found a solution yet, move on to step 2
2. Go one level deeper - given `A`, add one more step to your partial solution. This is like going to the next fork in the road.
3. Now go down each fork in the road and proceed to step 1 with each fork as the new `A`. So if we hit a fork in the road with 3 more streets to travel down, you're re-running backtrack on `A'1`, `A'2`, and `A'3`. If we've done all of the exploring possible or hit a dead end, proceed to step 4.
4. Now that we've explored, we need to move out of where we've traveled - either to explore another road because our previous one has terminated, or to end the algorithm because we've explored all roads.
And that's the basic gist of backtracking with DFS. From here we'll look at a specific example that I mentioned earlier: Sudoku.
## Backtracking by example: Sudoku
You've heard of [Sudoku](https://en.wikipedia.org/wiki/Sudoku), right? You wouldn't guess it from the name, but this _French_ puzzle game is inherently combinatorial in nature. The object of the game is to take a 9×9 grid of blanks and fill it with the digits 1 to 9. The puzzle is completed when every row, column, and sector (3×3 subproblems corresponding to the nine sectors of a tic-tac-toe puzzle) contain the digits 1 through 9 with no deletions or repetition.
Backtracking is an excellent way to solve this problem because the game starts partially filled in. That is, Step 1 is already filled out for us (we have a given partial solution `A`) and we need to take incremental steps to get to a completed board. To begin, we need to answer 3 questions:
1. **How do we set up this problem?** Since we have a 2D board matrix, a 2D array will suit us well in JavaScript. And what do we put in these array values? The state of the board, which we hope to fill out completely by the end of the exercise.
2. **What are the possibilities at each step to evaluate?** The set of values that haven't been taken yet in a given row, column, or sector.
3. **How do we know when to backtrack?** As soon as we are out of possibilities from Question 2 (i.e. we either have filled out the board or we have an invalid board).
With all of that in mind, we can now create a custom backtracking algorithm to solve a Sudoku board!
```js
class SudokuSolver {
const UNASSIGNED = 0;
const DIMENSION = 9;
const SECTOR = Math.sqrt(DIMENSION);
constructor(board) {
this.solve(board);
}
solve(board, row, col) {
const [row, col] = this.findUnassignedLocation(row, col);
// step 1 - print solution else find it
if (row === -1) return this.processSolution(board);
// step 2 - DFS; go one level deeper
for (let number = 1; number <= DIMENSION; number++) {
if (this.isValid(row, col, number)) {
board[row][col] = number;
// step 3 - if we've hit a valid spot to explore, go in!
if (this.solve(board, row, col)) return true;
board[row][col] = UNASSIGNED;
}
}
// step 4 - we've done everything; time to backtrack!
return false;
}
findUnassignedLocation(board, row, col) {
let assigned = false;
let currentLocation = [-1, -1];
while (!assigned) {
if (row === DIMENSION) {
assigned = true;
} else {
if (board[row][col] === UNASSIGNED) {
currentLocation = [row, col];
assigned = true;
} else {
if (col < 8) {
col++;
} else {
row++;
col = 0;
}
}
}
}
return currentLocation;
}
isValid(board, row, col, number) {
return (
this.isValidRow(board, row, number) &&
this.isValidColumn(board, col, number) &&
this.isValidSector(
board,
Math.floor(row / SECTOR) * SECTOR,
Math.floor(col / SECTOR) * SECTOR,
number
)
);
}
isValidRow(board, row, number) {
for (var col = 0; col < DIMENSION; col++) {
if (board[row][col] === number) return false;
}
return true;
}
isValidColumn(board, col, number) {
for (var row = 0; row < DIMENSION; row++) {
if (board[row][col] === number) return false;
}
return true;
}
isValidSector(board, row, col, number) {
for (var currentRow = 0; currentRow < SECTOR; currentRow++) {
for (var currentCol = 0; currentCol < SECTOR; currentCol++) {
if (board[row + currentRow][col + currentCol] === number) return false;
}
}
return true;
}
processSolution(board) {
let boardStr = '';
for (let i = 0; i < DIMENSION; i++) {
for (let j = 0; j < DIMENSION; j++) {
boardStr += grid[i][j];
}
boardStr += '\n';
}
console.log(boardStr);
}
}
```
Alright, lots to unpack here! Let's go through the four steps one-by-one and see how we're achieving backtracking for our Sudoku solver:
### 1. Determine if the Sudoku board is solved
It may seem counterintuitive to check if you've solved the solution before you've actually gone and solved it, but we're doing this first in the algorithm because we don't want to go down any unnecessary roads. Remember, algorithms are as much about correctness as they are efficiency.
So before anything else, we need to see if we have anymore unassigned locations. That just means asking our program "are there any spots on the Sudoku board left to be filled?" If not, we process a solution (which just means printing out the items in our 2D array) otherwise we add our next step and dive in.
### 2. Loop through the numbers
Now that we have a cell, we need to see which number(s) fit into that cell. Since there are 9 numbers on a Sudoku board to work with, we iterate through all of those; and that's where the DFS part comes in.
### 3. Dive into valid configurations
Let's say we know that 3, 5, and 8 are all valid numbers that could go in our current cell. The DFS recursion basically says "let's assume we fill out our board with the 3 here, will we be able to fill out the rest of our board?" And we try to fill out that board in the universe where 3 is in this cell.
### 4. Assign or backtrack
If we reach the end state, we're done. If we reach an invalid board, then we know we have to backtrack through this whole recursive dive until we get back to where we set the 3. We then try setting the 5, and go through the whole thing again, filling out all of the remaining slots until we either reach the end state or an invalid one. Repeating with all of the valid candidates until there are no more.
And that's it! Combinatorial optimization sounds like a mouthful when taken at face value. But in reality, it's just doing a depth-first search on all possibilities until we find the right one (or ones) to solve our problem!
## Onto the next Daily Problem
Now we can apply our skills with backtracking into the Daily Problem:
> _Multisets_ are allowed to have repeated elements. A multiset of `_n_` items may thus have fewer than `_n!_` distinct permutations. For example, `{1,1,2,2}` has only six different permutations: `{1, 1, 2, 2}, {1, 2, 1, 2}, {1, 2, 2, 1}, {2, 1, 1, 2}, {2, 1, 2, 1}, and {2,2,1,1}`. Design and implement an efficient algorithm for constructing all permutations of a multiset.
## More problems to practice on
Implementing a heuristic for the [graph bandwidth problem](https://en.wikipedia.org/wiki/Graph_bandwidth) is the recommended way from the class to start diving into combinatorial search problems.
The [Eight-Queens Problem](https://medium.freecodecamp.org/lets-backtrack-and-save-some-queens-1f9ef6af5415) is another classic combinatorial backtracking algorithm to implement if you're looking for something slightly different.
Finally, the class-sponsored [Homework 4](https://www3.cs.stonybrook.edu/~skiena/373/hw/hw4.pdf) focuses purely on combinatorial search.
If you have an implementation to any of these, be sure to [share it with me on Twitter](https://twitter.com/theadamconrad)!
]]>2018-10-30T23:30:00+00:00Maximum Flow Algorithms for Networks in JavaScript
https://www.adamconrad.dev/blog/maximum-flow-algorithms-networks-js/
Tue, 23 Oct 2018 23:30:00 +0000https://www.adamconrad.dev/blog/maximum-flow-algorithms-networks-js/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/shortest-paths/), check that out first.
## Answers to the previous Daily Problem
> Let `G = (V, E)` be an undirected weighted graph, and let `T` be the shortest-path spanning tree rooted at a vertex `v`. Suppose now that all the edge weights in `G` are increased by a constant number `k`. Is `T` still the shortest-path spanning tree from `v`?
The answer is no, and it will make sense with a simple example.
Suppose you have a triangle with two edges equal to 3 each and one edge equal to 7 (for all of your mathematicians out there, this is a contrived example so don't message me that these edge weights couldn't possibly construct a real triangle). The shortest path is going to be along the 2 3-weighted edges (with a total path weight of 6).
But if you add 10 to all 3 edges, you now have 13, 13, and 17, respectively. Now 17 is the shortest path. What was once the longest edge is now the shortest path, because if you went along the original shortest path, you'd travel 13+13 for a total path weight of 26.
## Maximum flow and bipartite matching
We mentioned in a previous article about what a _bipartite graph_ is, but at a high level, it's just a way to divide a graph into two distinct sections. For example, a set of job vertices and a set of people to do those jobs (also represented as vertices). How can we connect these two sides in a way that the maximum amount of work gets done?
This is called the _network flow problem_ because we want to be able to maximize the capacities from one side to the other so that it flows with maximum efficiency and capacity.
### Why is this important to you?
Maximum flow algorithms are often solved with something called _augmented paths_. Think of it like pouring coffee through a filter or a clogged drain. If one pipe is running slow or only has so much capacity, wouldn't you divert more water to the bigger pipe or the one that isn't clogged?
Or imagine you're a programmer on the city utilities division and you need to figure out how much water you can pump out to the town from a desalinization plant. How do you figure out where the water can go without wasting it or fully utilizing all of the available pipes? **As a front-end developer, you might have to map this relationship on a visualization.**
### Residual graphs
We can use something called a **residual graph** to figure out this network. A residual graph is like a regular graph but it also includes _flow_ and _capacity_. So if we look at a graph `G` and we layer on a residual graph `R`, we can now see, for every edge, how much flow can be pushed along an edge and, based on the weight, what the capacity that can be applied as well. With all of this extra data, we can find the inefficiencies as well as our limits.
Additionally, we can see _what direction_ we can make use of the weights of an edge. For example, if to travel from `a` to `b` along edge `(a,b)` the weight is 12, we might find out from our residual graph that from `a->b` our flow is 10 and from `b->a` our flow is only 2. Whereas another edge may reveal that regardless of direction, the price is 5. These are very key points when trying to figure out how to best utilize our network.
In fact what this tells us is that the flow on our 2nd edge is maximized, and that our primary edge of `(a,b)` has some extra capacity to utilize. How much capacity? The _minimum cut_ of that edge. Let's look at that again. The weight of `(a,b)` is 12, with one cut at 10 and one cut at 2, so **the amount we can optimize is by the minimum cut** (in this case, 2). In fact we can generalize this augmented paths problem into the algorithm represented below:
```js
class Vertex {
constructor(
capacity = 0,
flow = 0,
neighbor = null,
nextVertex = null,
residualCapacity = 0
) {
this.capacity = capacity;
this.flow = flow;
this.neighbor = neighbor;
this.nextVertex = nextVertex;
this.residualCapacity = residualCapacity;
}
}
class Graph {
resetSearch() {
for (let i = 0; i < this.vertices.length; i++) {
PROCESSED[i] = false;
VISITED[i] = false;
PARENTS[i] = null;
}
}
findEdge(start, end) {
let path = this.connections[start];
while (path) {
if (path.adjacencyInfo === end) return path;
path = path.nextVertex;
}
return path;
}
augmentPath(start, end, parents, volume) {
if (start === end) return;
let edge = this.findEdge(parents[end], end);
edge.flow += volume;
edge.residualCapacity -= volume;
edge = this.findEdge(end, parents[end]);
edge.residualCapacity += volume;
this.augmentPath(start, parents[end], parents, volume);
}
pathVolume(source, sink, parents) {
if (parents[parents.length-1] === -1) return 0;
let edge = this.findEdge(parents[sink], sink);
if (source === parents[sink]) {
return edge.residualCapacity;
} else {
return Math.min(
this.pathVolume(source, parents[sink], parents),
edge.residualCapacity
);
}
}
edmondsKarp(source, sink) {
this.addResidualEdges();
this.resetSearch();
this.bfs(source);
let volume = this.pathVolume(source, sink, this.PARENTS);
while (volume > 0) {
this.augmentPath(source, sink, this.PARENTS, volume);
this.resetSearch();
this.bfs(source);
volume = this.pathVolume(source, sink, this.PARENTS);
}
}
}
```
In this case, the `source` is an edge of a bipartite graph (let's say one section is `L` and the other section is `R`) and the `sink` is the edge in the other section of the graph. These edges are special because they are connected to all other edges on their side by a weight of 1. Their job is to provide easy access to all edges in the graph and map out all of the main paths and find the efficiencies. We do this via breadth-first search to look for any path from the `source` to the `sink` that increases the total flow. The algorithm, known as Edmonds-Karp, is done when we have no more extra volume left to optimize.
> Note: for the very astute, you might recognize this as the [Ford-Fulkerson method](https://en.wikipedia.org/wiki/Ford%E2%80%93Fulkerson_algorithm). Though not fully recognized as an algorithm, Edmonds-Karp is an _implementation_ of Ford-Fulkerson for maximum flow problems on networks.
Now that we've seen a bunch of algorithms for moving around a graph, here's a few things to keep in mind:
1. **Map the problem. Solve with an algorithm.** Think of these algorithms as your ace-in-the-hole. They're your ammunition for firing at problems. But you need to know what to fire at.
2. **Create the framework.** All problems can be fit into some sort of framework. Once you know the problems, the solutions are easy since they've already been given to you.
3. **Practice.** Problems are easier to recognize and slot into that framework if you see a lot of them. Make sure you hit the books and give some coding problems a try!
And with that last tip, let's get on to the Daily Problem and apply some of these recent algorithmic concepts!
## Onto the next Daily Problem
> Let `G = (V,E)` be a directed weighted graph such that all the weights are positive. Let `v` and `w` be two vertices in `G` and `k ≤ |V|` be an integer. Design an algorithm to find the shortest path from `v` to `w` that contains exactly `k` edges. Note that the path need not be simple.
## More problems to practice on
1. Problem 6-24.
2. Problem 6-25.
]]>2018-10-23T23:30:00+00:00How the New York Times uses React
https://www.adamconrad.dev/blog/how-the-new-york-times-uses-react/
Mon, 22 Oct 2018 10:28:00 +0000https://www.adamconrad.dev/blog/how-the-new-york-times-uses-react/2018-10-22T10:28:00+00:00Why you should start a newsletter even if you never grow a big audience
https://www.adamconrad.dev/blog/why-you-should-start-a-newsletter-even-if-you-never-grow-a-big-audience/
Sat, 20 Oct 2018 10:28:00 +0000https://www.adamconrad.dev/blog/why-you-should-start-a-newsletter-even-if-you-never-grow-a-big-audience/2018-10-20T10:28:00+00:00Shortest Path Problems in the Real World
https://www.adamconrad.dev/blog/shortest-paths/
Thu, 18 Oct 2018 10:28:00 +0000https://www.adamconrad.dev/blog/shortest-paths/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/minimum-spanning-trees/), check that out first.
## Answers to the previous Daily Problem
> Suppose we are given the minimum spanning tree `T` of a given graph `G` (with `n` vertices and `m` edges) and a new edge `e = (u, v)` of weight `w` that we will add to `G`. Give an efficient algorithm to find the minimum spanning tree of the graph `G + e`. Your algorithm should run in `O(n)` time to receive full credit.
Prim's or Kruskal's will suffice in solving this, but to run this in linear time we'd probably prefer Kruskal's. A graph `G + e` is no different to solve than `G` since `G` is just a subtree of `G + e`. So why should the algorithm change? Kruskal's can be used as is, but here's the distinguishing factors to look out for:
* If `e` is a backward edge, we ignore it because it's not helping us explore more of the tree. _But_ this is important because if we detect a cycle, we've now found the edge we can throw out (if it's not our current edge `e`) in our DFS exploration.
* If `e` is a forward edge, we also ignore it for the same reason, since forward edges have already been explored.
* If `e` is a tree edge, we have to compare it against all other edges leading to `v` to determine if we have to include it in our MST. If `e`'s weight `w` is less than that of the other incident weights, we will add `e` to our MST for `G + e`. Otherwise our MST hasn't changed from graph `G`.
* If `e` is a cross edge, it doesn't factor into the MST and we can safely ignore it.
This runs in `O(n)` time because our DFS to find the new edge only costs `O(n)` in a sparse graph, and once we're there it's just some constant-time operations to do comparisons to see if the new edge will be swapped into the MST or not.
## Applications for shortest paths
As we saw above, transporation problems (with solutions like Google Maps, Waze, and countless others) are a prime example of real-world applications for shortest path problems. There are a few others to consider as well if you aren't convinced yet.
### Motion planning
Have you ever used a [flip book](https://www.youtube.com/watch?v=5A0Ro4vj3KM) to animate a scene? How would you animate someone walking in that book? We all have an idea in our head as to what a human looks like when they walk. At each page of the flip book, you're using the path of the limbs to anticipate the next frame. **Flip book animation is like a shortest path problem.**
When we flip between frames in a flip book, to get to the next one, we're having our character move in the most natural (i.e. shortest) path from one point in space to the next. If you swing your leg up, it's not going to move erratically. Instead, it will move in a smooth motion, traveling along an arc that is provided in part by the contraction of your muscles and the anatomy of your bones to allow for that motion.
### Communication networks
What's the fastest way to send an email? Where's the best place to provide a CDN of your images and JavaScript? These are both shortest path problems. The answers lie in distributed networks such as Amazon Web Services and relay networks of mail servers.
This is why, for example, you are asked to choose where you want your servers to live on AWS. If you are starting a blog that caters to local businesses in Boston, it's going to be faster to serve them images and content from the `us-east` region instead of `ap-southeast`. Information travels pretty fast, but even basic spacial reasoning can convince us it will take less time to travel to our Boston customers from servers in Ohio or Virginia than from servers in Singapore or Sydney.
## How do we solve shortest path problems?
So given all of these kinds of applications, how would we go about beginning to solve them? **For unweighted graphs, BFS is sufficient.** Since all edges have equal weights, it doesn't matter how we get from A to B, just that we get there in the fewest hops, and BFS will be able to document that for us in `O(n + m)` which is linear and very efficient. But as we saw with MSTs, unweighted graphs aren't very interesting problems.
Weighted graphs are much more challenging to solve. BFS is insufficient for solving weighted graphs for shortest paths because **BFS can find _a_ short path but not the optimal shortest path.** This is because BFS could find you the path with the least weight, but requires you to traverse the most number of edges.
This is like how when you put into Google Maps for the shortest timed route, it will have you taking all of these weird shortcuts even though you know that there are more direct routes with less turns and stop signs (but probably more traffic). So how _do_ we solve the shortest path problem for weighted graphs? We're going to explore two solutions: **Dijkstra's Algorithm** and the **Floyd-Warshall Algorithm**.
### Dijkstra’s Algorithm
Dijkstra's is the premier algorithm for solving shortest path problems with weighted graphs. It's also an example of **dynamic programming**, a concept that seems to freak out many a developer.
Dynamic programming is another divide-and-conquer technique where we use the results of a subproblem in order to help answer the general problem we are trying to solve. That's it!
Dijkstra's is a dynamic programming application because if we have a path from `s->v->e` where `s` is the starting vertex and `e` is the ending one, we know that there is a middle vertex `v` such that there is a shortest path between `s->v`.
In other words, we can step back from `e` all the way back to `s` with subproblems of saying "so if we know want to know the shortest path from `s->e`, can we compute the shortest path from `s->(e-1)`? If we can find that, can we computer the shortest path from `s->(e-2)`?" and so on, until we've reached the shortest path from `s` to it's closest neighbor. _This_ is applying dynamic progamming in the form of Dijkstra's Algorithm.
```js
class Graph {
// rest of structure from previous articles
let ADDED, DISTANCES, PARENTS;
dijkstra(startVertex) {
ADDED = new Array(this.vertices.length).fill(false);
DISTANCES = new Array(this.vertices.length).fill(Number.POSITIVE_INFINITY);
PARENTS = new Array(this.vertices.length).fill(-1);
DISTANCES[startVertex] = 0;
let currentVertex = startVertex, currentEdge;
while (!ADDED[currentVertex]) {
ADDED[currentVertex] = true;
currentEdge = this.connections[currentVertex];
while (currentEdge) {
let nextVertex = currentEdge.adjacencyInfo;
let weight = currentEdge.weight;
if (DISTANCES[nextVertex] > (DISTANCES[currentVertex] + weight)) {
DISTANCES[nextVertex] = DISTANCES[currentVertex] + weight;
PARENTS[nextVertex] = currentVertex;
}
currentEdge = currentEdge.nextVertex;
}
currentVertex = 1;
let bestCurrentDistance = Number.POSITIVE_INFINITY;
for (let i = 0; i < this.vertices.length; i++) {
if (!ADDED[i] && bestCurrentDistance > DISTANCES[i]) {
bestCurrentDistance = DISTANCES[i];
currentVertex = i;
}
}
}
}
}
```
Does this algorithm look familiar? It should. Dijkstra's Algorithm is, in fact, _extremely_ similar to [Prim's Algorithm](/blog/minimum-spanning-trees-in-js/) from the last article. In fact, it is so similar, I only had to change 3 lines (1 of which was the name of the function). The only real difference between Prim's and Dijkstra's is in how they compare distances.
In Prim's, we check to see if the next vertex's distance is greater than the current edge weight and if it has been added yet. If it hasn't, then we set the distance to the next vertex equal to that current edge weight and make the current vertex the parent of the next.
In Dijkstra's, all we do differently is check to see if the next vertex's distance is greater than _the current edge weight PLUS the distance of the current vertex_. That summed value is what gets added to the distance array for the next vertex, and we add the current vertex to the parent of the next vertex as normal. **In sum, all we are doing extra in Dijkstra's is factoring in the new edge weight and the distance from the starting vertex to the tree vertex it is adjacent to.**
The implication here is that Dijkstra's not only finds the shortest path from `s` to `e`, it also finds the shortest paths from `s` to _all other vertices in the graph_. By having to inspect all neighbors at every given step, **Dijkstra can map all shortest routes from all vertices**. Powerful stuff, but at a cost of `O(n^2)` since every vertex is compared to every other vertex.
### Floyd-Warshall Algorithm
With most of these graph problems so far, our examples lead us to pick vertices on the outer ends of the graph, much like how we start with the root node of a tree. But what if you wanted to start in the middle? What if you wanted to know the most centrally-located vertex in a graph? In fact, the first example I could think of is Sim City.
In Sim City, the "goal" (which I put in quotes because the game is open-ended and has no real objective ending) is to create a vibrant, happy city of people, or "sims." It's essentially one condensed simulation in urban planning. You have to provide people power for their homes, roads for them to travel to work (and places to work), and all of the amenities a local municipality needs like schools, police stations, and parks. But where do you place all of this stuff to make people happy?
For many of the buildings, like police stations, they can only operate in a certain radius to effectively stop crime before it's too late. Logically, if you put a police station on the edge of town and someone commits a crime on the other end, it's going to take more time for the police cars to arrive on the scene than if it were centrally located.
And since cities in Sim City can be quite large, it's not sufficient to just place one police station in the very middle of the map and hope for the best. You'll need several stations to cover the entire map. And your map, like the real world, is not simply a square grid of flat grass and plains. Natural features like rivers, oceans, and mountains can complicate how a station can effectively police an area.
Lucky for you, there is an algorithm called **Floyd-Warshall** that can objectively find the best spot to place your buildings by finding the _all-pairs shortest path_. In other words, at every vertex we can start from we find the shortest path across the graph and see how long it takes to get to every other vertex. Each time we start over, we keep score of the total moves required for each vertex. The shorest average distance will come from that central vertex, which we can calculate with an adjacency matrix. And since we are now adding another layer of checking every vertex on top of what is essentially Dijkstra's, this algorithm runs in `O(n^3)` time.
Floyd-Warshall doesn't actually produce a singular return value of the optimal location. Instead, it returns the distance matrix with all of the optimal paths mapped out, which is often sufficient enough for most problems of this scope. Even though cubic time may seem slow, the fact is this algorithm runs fast in practice, in part because it utilizes an adjacency matrix to handle the mapping of all of its distance values (one of the rare instances that we [originally mentioned](/blog/data-structures-for-graphs/) where an adjacency matrix is a better data structure than an adjacency list). It also helps that the algorithm is simple to implement, too:
```js
class AdjacencyMatrix {
constructor(size) {
// ES6 gives us a nice way of filling in a 2D array
this.weights = new Array(size).fill(
new Array(size).fill(Number.POSITIVE_INFINITY)
);
this.vertices = size;
}
floydWarshall() {
let distance;
for (let k = 0; k < this.vertices; k++) {
for (let i = 0; i < this.vertices; i++) {
for (let j = 0; j < this.vertices; j++) {
distance = this.weights[i][k] + this.weights[k][j];
if (distance < this.weights[i][j]) {
this.weight[i][j] = distance;
}
}
}
}
return this;
}
}
```
You can see from the triple-nested `for` loops very clearly that this is indeed a `O(n^3)` worst-case algorithm. This cost is acceptable for finding the all-pairs shortest path, but is also a good candidate for solving what are called _transitive closure_ problems. This is best explained with an example.
Who has the most power in a friend group? If mapped on a graph, you might think it's the center of the friend group, because he/she has the most immediate friends (i.e. the most direct connections to other people, or the vertex with the highest _degree_). In fact, it is the person _who has the farthest reach into the entire graph_.
The President of the United States is the most powerful person in the world not because he has the most friends, but because he has _the largest, most powerful network at his disposal_. He may not have everyone in his phone, but the people in his phone can eventually connect him to virtually anyone.
In fact, there's a popular phenomenon around this very concept of transitives closures called [Six Degrees of Kevin Bacon](https://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon). It asserts that Kevin Bacon is the most powerful celebrity because "he had worked with everybody in Hollywood or someone who’s worked with them."
To prove this statement true once and for all, you could plot every Hollywood celebrity on an adjacency matrix and map their relationships with each other as edges with weights for the strength of the relationship. If Kevin Bacon has the all-pairs shortest path to every other celebrity in Hollywood then this Wikipedia entry is not just a parlor game, but a true account!
## Onto the next Daily Problem
> Let `G = (V, E)` be an undirected weighted graph, and let `T` be the shortest-path spanning tree rooted at a vertex `v`. Suppose now that all the edge weights in `G` are increased by a constant number `k`. Is `T` still the shortest-path spanning tree from `v`?
## More problems to practice on
1. Problem 6-15.
2. Problem 6-17.
]]>2018-10-18T10:28:00+00:00Minimum Spanning Trees in JavaScript
https://www.adamconrad.dev/blog/minimum-spanning-trees-in-js/
Tue, 16 Oct 2018 10:28:00 +0000https://www.adamconrad.dev/blog/minimum-spanning-trees-in-js/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/dfs-and-topological-sort/), check that out first.
## Answers to the previous Daily Problem
> Your job is to arrange `n` ill-behaved children in a straight line, facing front. You are given a list of `m` statements of the form _“`i` hates `j`”_. If `i` hates `j`, then you do not want put `i` somewhere behind `j`, because then `i` is capable of throwing something at `j`.
We need to order kids in a line such that they don't rip each other to shreds. Seems reasonable.
> (a) Give an algorithm that orders the line, (or says that it is not possible) in `O(m + n)` time.
So what we want is a valid list out of a graph relationship where all of the students are facing forward (hint, hint: a directed acyclic graph). Solving a DAG to produce a valid list? This sounds exactly what a topological sort would provide.
Recall that topological sort on top of DFS is just including an extra stack, and that runtime is `O(V+E)`. Here we mention `n` as the vertices and `m` as the edge relationships between any two children, so we can indeed say that our DFS topological sort will run in `O(m + n)` time.
> (b) Suppose instead you want to arrange the children in rows such that if `i` hates `j`, then `i` must be in a lower numbered row than `j`. Give an efficient algorithm to find the minimum number of rows needed, if it is possible.
Since we're dealing with rows (or going across), this lends itself to being solved with BFS instead of DFS. Tracking the depth within the tree as you traverse it, and once you find the last tree node, you now know the maximum height of the tree, which therefore indicates the number of rows that are needed.
## Definition of a minimum spanning tree
A _spanning tree_ for a graph is the set of edges that connect to all vertices in the graph. In other words, **it's the edges that make the graph fully connected**. So you might think the minimum spanning tree is the minimum set of edges that connect a graph completely. In actuality, **the minimum spanning tree (MST) is the set of edges that connect all vertices in a graph with the smallest weight possible**.
When are MSTs useful? **Any time you need to find the minimum-weighted path on a route**, such as the shortest path on a road trip (we'll learn more about shortest path problems in the next article).
### Variants
There are a few variations of spanning trees that are similar to MSTs that are also worth noting:
1. **Maximum spanning trees.** As you can imagine, this looks for the path with the heaviest edges to connect the graph instead of the lightest (I'd imagine this would be great for something like Settlers of Catan). You'll see Prim's algorithm ahead for MSTs; to handle maximum spanning trees, just negate the weights to find the heaviest/longest path.
2. **Minimum product spanning trees.** Instead of finding the edge weights with the lowest sum, you want to find the lowest product. To do this, you just add up the logarithms of the weights of the edges instead of just the weights. Why would we do this? Likely for alternative routes that have lower absolute values, sort of like how you can choose the shortest route in time as opposed tot he shortest route in distance.
3. **Minimum bottleneck spanning trees.** This just is another condition of MSTs that we ensure the maximum edge weight is minimized. We can take care of this with Kruskal's algorithm a bit later.
4. **Steiner trees.** A minimum spanning tree with [intermediate midpoints](https://en.wikipedia.org/wiki/Steiner_tree_problem). Wiring routing and circuits deal with this, so if you do any hardware engineering, you might encounter a Steiner tree.
5. **Low-degree spanning tree.** If you're visiting all vertices using a [Hamiltonian path](https://en.wikipedia.org/wiki/Hamiltonian_path) you might construct a low-degree spanning tree, but you can't solve it with the algorithms we're about to show you. The low-degree part just ensures that we aren't visiting hub nodes with a lot of outbound routes, like when you reach a rotary with a lot of exits.
### Prim's Algorithm for MSTs
Prim's algorithm is one way to find a minimum spanning tree by starting with a vertex and progressively choosing the best neighboring vertex without regard to the entire structure of the graph. When you do something like choosing the minimum edge weight for a local set of edges without regard to finding the absolute minimum to the whole structure, that is called a _greedy algorithm_.
```js
class Graph {
// rest of structure from previous articles
let ADDED, DISTANCES, PARENTS;
prim(startVertex) {
ADDED = new Array(this.vertices.length).fill(false);
DISTANCES = new Array(this.vertices.length).fill(Number.POSITIVE_INFINITY);
PARENTS = new Array(this.vertices.length).fill(-1);
DISTANCES[startVertex] = 0;
let currentVertex = startVertex, currentEdge;
while (!ADDED[currentVertex]) {
ADDED[currentVertex] = true;
currentEdge = this.connections[currentVertex];
while (currentEdge) {
let nextVertex = currentEdge.adjacencyInfo;
let weight = currentEdge.weight;
if (DISTANCES[nextVertex] > weight && !ADDED[nextVertex]) {
DISTANCES[nextVertex] = weight;
PARENTS[nextVertex] = currentVertex;
}
currentEdge = currentEdge.nextVertex;
}
currentVertex = 1;
let bestCurrentDistance = Number.POSITIVE_INFINITY;
for (let i = 0; i < this.vertices.length; i++) {
if (!ADDED[i] && bestCurrentDistance > DISTANCES[i]) {
bestCurrentDistance = DISTANCES[i];
currentVertex = i;
}
}
}
}
}
```
Using our Big O notation tricks for calculating the runtime, we can see that we need to iterate `n` times for all vertices which sweep across only the minimum `m` edges (which is less than or equal to `n`) to connect the whole graph, which leaves our runtime for Prim's to be `O(n^2)`. Using a priority queue can drop this time down even further to `O(m + (n log n))` (note: can you figure out why?).
### Kruskal's Algorithm for MSTs
An alternative greedy algorithm to Prim's is called Kruskal's Algorithm, which just doesn't have a starting vertex. It instead builds up connections via components of vertices and, for sparse graphs, can run faster than Prim's, provided it uses the right data structure.
```js
class Graph {
let EDGE_PAIRS = new Array(MAX_VERTICES);
kruskal() {
let counter;
let set = new SetUnion(this.vertices);
this.toEdgeArray(); // using EDGE_PAIRS
// sort by weight rather than a standard array value
quicksort(this.EDGE_PAIRS, this.connections.length, this.EDGE_PAIRS.length);
for (let i = 0; i <= this.connections; i++) {
if (!sameComponent(set, EDGE_PAIRS[i].startVertex, EDGE_PAIRS[i].endVertex)) {
console.log(`edge ${EDGE_PAIRS[i].startVertex},${EDGE_PAIRS[i].endVertex} in MST`);
set.union(EDGE_PAIRS[i].startVertex, EDGE_PAIRS[i].endVertex);
}
}
}
}
```
To summarize the two:
1. _Are you trying to find an MST in a sparse graph?_ **Use Kruskal's Algorithm.**
2. _Are you trying to find an MST in a dense graph?_ **Use Prim's Algorithm.**
As you can see, the data structure for fast search of a sparse graph requires something called the Union-Finding data structure, which we will discuss next.
### Union Finding
When we're dealing with MSTs, we're thinking about navigating the graph in the shortest/quickest path possible. But to find the best way around our graph, we can't just look at our immediate neighbors to determine the best way forward. It's often helpful to cluster whole "neighborhoods" of nodes together to form a better picture about where we should move next.
As we say with Kruskal's Algorithm, one way to do that is by partitioning the graph into distinct sets and sorting against that. To make that happen efficiently, we utilized a `SetUnion` data structure which we will outline below. Set unions have two primary functions:
1. `areSame()` to determine if two vertices are in the same partition
2. `merge()` to merge two partitions together
Whereas traditional nodes in a tree focus on moving down the tree and keeping track of children, set unions have nodes who focus on moving _up_ by keeping a close watch on their parents. So if you have a graph of 3 groups like so:
```
0 3 6
| / \
1 2 4
\
5
```
You could represent the array of parents as `[0, 0, 3, 3, 3, 4, 6]`. A more complete example in JavaScript follows:
```js
class SetUnion {
constructor(length) {
this.length = length;
this.parents = [...Array(length).keys()];
this.elements = Array(length).fill(1);
}
find(element) {
if (this.elements[element] === element) return element;
return this.find(this.parents[element]);
}
areSame(set1, set2) {
return this.find(set1) == this.find(set2);
}
merge(set1, set2) {
const root1 = this.find(set1);
const root2 = this.find(set2);
if (root1 === root2) return "already same set";
if (this.elements[root1] >= this.elements[root2]) {
this.elements[root1] += this.elements[root2];
this.parents[root2] = root1;
} else {
this.elements[root2] += this.elements[root1];
this.parents[root1] = root2;
}
}
}
```
As you can see from above, `areSame()` and `merge()` rely on `find()` to recursively traverse the relavent subtree in the graph for the parent that matches. We know from [earlier](/blog/heapsort-priority-queues-in-js/) that recursive partitioning takes `O(log n)` time and so we can conclude that in the worst-case any set union operation will take logarithmic time, which is a great guarantee for Kruskal's Algorithm.
Now that we've explored a few algorithms for route finding, let's put some of this knowledge into practice with the Daily Problem.
## Onto the next Daily Problem
> Suppose we are given the minimum spanning tree `T` of a given graph `G` (with `n` vertices and `m` edges) and a new edge `e = (u, v)` of weight `w` that we will add to `G`. Give an efficient algorithm to find the minimum spanning tree of the graph `G + e`. Your algorithm should run in `O(n)` time to receive full credit.
## More problems to practice on
To get even more practice with graph searching and path finding, here are the other homework problems to go along with this article:
1. Implement an algorithm to print out the connected components in an undirected graph.
2. Problem 6-4.
3. Problem 6-5.
4. Problem 6-9.
]]>2018-10-16T10:28:00+00:00Depth-first Search & Topological Sort in JavaScript
https://www.adamconrad.dev/blog/dfs-and-topological-sort/
Thu, 11 Oct 2018 10:28:00 +0000https://www.adamconrad.dev/blog/dfs-and-topological-sort/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/bfs/), check that out first.
## Answers to the previous Daily Problem
> Prove that in a breadth-first search on a undirected graph `G`, every edge is either a tree edge or a cross edge, where `x` is neither an ancestor nor descendant of `y`, in cross edge `(x,y)`.
It helps to provide definitions to the term _tree edge_ and _cross edge_. If you are traveling from `(x,y)` and it's your first time visiting `y`, we say that this is a **tree edge**. As the example mentions above, a **cross edge** is on a path between two points that is neither the ancestor nor the descendant of `x` and `y`. It also helps to define what a _forward edge_ and a _backward edge_ are to give a complete picture.
All edges adhere to one of these four types. **Tree edges** only apply to newly-visited vertices. The other three apply to previously-visited verticies. If a visit already happens for the first time, we run through this simple ruleset:
1. If the `y` in this `(x,y)` relationship is an ancestor, it is a **backward edge**.
2. If the `y` is a descendant, it is a **forward edge**.
3. If `y` is neither, that's the **cross edge**.
So let's go through all four and eliminate by contradiction.
Assume `G` has a backward edge. If a backward edge exists, there is a cycle. If we have a cycle, we have to terminate the search after an edge is processed at most once (otherwise we would loop infinitely on our cycle and we could bounce back and forth between the two vertices involved in the cycle). That means that in this case, the only other possible edges are tree edges.
Assume `G` has a forward edge. If a forward edge exists, then `(y,x)` would have already been visited during our search before we reach `x` and try to visit `y`. Since you cannot visit the descendant before you visit the ancestor, no forward edges can exist in an undirected graph.
Assume `G` has a cross edge. If a cross edge exists, we're really saying that there are connections between siblings in the tree. BFS operates by going across each layer in the height of the tree. Even though those cross edges aren't formally defined in terms of real connections, they manifest in how our output array of BFS exploration looks.
By the process of elimination, the only edge types available left are tree edges, which works just fine. Since BFS is a graph-walking algorithm, the whole point is to visit every node. By visiting a node, those edges are tree nodes their first time around. Since we don't need to visit anything a second time via our queue, our proof is satisfied.
Now that we understand a bit more about ancestors and descendants, this should make our implementation of Depth-First Search a bit clearer (Hint: DFS only has tree edges and back edges).
## Depth-First Search: Like BFS but with a stack
As I mentioned earlier, the real difference between how BFS and DFS walk a graph is in the data structure they use to make that walk. **BFS uses a queue and DFS uses a stack.** This means that there is some backtracking involved; the algorithm will go as far down the children as it can before it turns back to dive down the next stack of successive child nodes. **This tracking is what makes DFS so powerful.**
When remembering depth-first search, just remember that **we're going to visit nodes in a top-to-bottom fashion.** So a tree that looks like this:
```
8
/ \
6 10
/ \ \
4 5 12
```
Will read the tree with DFS in this order: `[8, 6, 4, 5, 10, 12]`. The generic algorithm for all graphs (not just trees) using DFS looks like this:
```js
class Graph {
// ... see last article for our implementation
let count = 0;
const ENTRY_TIMES = new Array(MAX_VERTICES);
const EXIT_TIMES = new Array(MAX_VERTICES);
dfs(currentVertex) {
if (finished) return;
let nextVertex;
VISITED[currentVertex] = true;
count += 1;
ENTRY_TIMES[currentVertex] = count;
console.log("PRE-PROCESSED!");
tempVertex = this.connections[currentVertex];
while (tempVertex) {
nextVertex = tempVertex.adjacencyInfo;
if (!VISITED[nextVertex]) {
PARENTS[nextVertex] = currentVertex;
console.log(`PROCESSED EDGE ${currentVertex}=>${nextVertex}`);
this.dfs(nextVertex);
} else if (
(!PROCESSED[nextVertex] && PARENTS[currentVertex] !== nextVertex) ||
this.isDirected
) {
console.log(`PROCESSED EDGE ${currentVertex}=>${nextVertex}`);
if (finished) return;
tempVertex = tempVertex.nextVertex;
}
}
console.log("POST-PROCESSED");
count +=1;
EXIT_TIMES[currentVertex] = count;
PROCESSED[currentVertex] = true;
}
}
```
### Why use depth-first search?
So now we know the _what_ and the _how_ of DFS, but _why_ should we care to use it? Here's a couple reasons:
* **Cycle detection.** As we mentioned from the previous Daily Problem, cycles can occur with back edges. Back edges are very easy to detect in DFS because backtracking is built into the algorithm. How long will this take? Only `O(n)` because it will take `n` vertices to find a tree node that has to head back up to an ancestor (via a back edge). Since `n` vertices need to be explored, that's only `n-1` edges to visit to find that back edge and thus the cycle. This reduces down to the worst-case `O(n)` to find the cycle in the graph.
* **Dead ends and cutoffs.** Any vertices where cutting them off disconnects the graph is called an **articulation vertex**. DFS can find these in linear time (because of the ability to look back on a parent node to see if connectivity still exists) while BFS can only do this in quadratic time.
### DFS for directed graphs: Topological sort
When graphs are directed, we now have the possibility of all for edge case types to consider. Each of these four cases helps learn more about what our graph may be doing. Recall that if no back edges exist, we have an acyclic graph. Also recall that directed acyclic graphs (DAGs) possess some interesting properties.
The most important of them is that for a certain configuration, you can represent a DAG as a list of nodes in a linear order (like you would a linked list). Such a configuration (of which more than one can exist for certain DAGs) is called a **topological sort**. Topological sorting is important because it proves that you can process any vertex before its successors. Put it another way, **we can find efficient shortest paths because only relevant vertices are involved in the topological sorting**.
```js
class Graph {
topologicalSort() {
for (let i = 0; i < this.vertices; i++) {
if (!VISITED[i]) {
// the console logs from dfs() as we pop off stack
// will print a topological sort order
this.dfs(i);
}
}
}
}
```
## Onto the next Daily Problem
Now that we've covered the basics of graph searching, be sure to study this and the previous article to compare and contrast why these implementations work and how they're useful in different situations. Given your knowledge of both now, this next Daily Problem should give you a good idea of how to solve each part:
> Your job is to arrange `n` ill-behaved children in a straight line, facing front. You are given a list of `m` statements of the form _“`i` hates `j`”_. If `i` hates `j`, then you do not want put `i` somewhere behind `j`, because then `i` is capable of throwing something at `j`.
>
> (a) Give an algorithm that orders the line, (or says that it is not possible) in `O(m + n)` time.
>
> (b) Suppose instead you want to arrange the children in rows such that if `i` hates `j`, then `i` must be in a lower numbered row than `j`. Give an efficient algorithm to find the minimum number of rows needed, if it is possible.
## More practice problems
To wrap up this chapter here are the other homework problems to go along with these articles:
1. Problem 5-12.
2. Problem 5-13.
3. Problem 5-14.
4. Problem 5-19.
5. Problem 5-25.
Think you've got the answers? Let's see how you do in the next article!
]]>2018-10-11T10:28:00+00:00An implementation of Breadth-First Search in JavaScript
https://www.adamconrad.dev/blog/bfs/
Thu, 04 Oct 2018 10:28:00 +0000https://www.adamconrad.dev/blog/bfs/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/data-structures-for-graphs/), check that out first.
## Answers to the previous Daily Problem
> Present correct and efficient algorithms to convert an undirected graph `G` between the following graph data structures. You must give the time complexity of each algorithm, assuming `n` vertices and `m` edges.
Basically we're being asked to take a graph and morph it between the data structures we learned about in the last article.
> (a) Convert from an adjacency matrix to adjacency lists.
In this case we're going from the space-heavy to the space-efficient, and mapping the relationships. In this case we iterate through every row in our 2D array, which represents each node (read: vertex) in our list. Then all we have to do is, for each node, add the `next` nodes for each pointer that is activated (read: the `1`s in the matrix). That's an `O(n^2)` algorithm since we're doing a constant amount of work for each assignment and we have to compare the `n` columns against the `n` rows.
> (b) Convert from an adjacency list to an incidence matrix. An incidence matrix `M` has a row for each vertex and a column for each edge, such that `M[i,j] = 1` if vertex `i` is part of edge `j`, `otherwise M[i,j] = 0`.
This one kind of gives it away by saying `for each` twice, since we're walking down the vertices and comparing against the other vertex on that edge giving us an `O(n+m)` algorithm.
> (c) Convert from an incidence matrix to adjacency lists.
This one is actually interesting. We know we need to iterate across all edges, so that's `O(m)` immediately. The difference is that for every `1` we find in an incidence matrix, we need to find the 2 `n` vertices that represent the two ends of that edge, therefore we need to make 2 operations for each vertex. So this conversion is actually `O(2nm)`, which reduces down to `O(nm)` because as we learned back with our article on [Big O notation](/blog/big-o-notation/), the constants on any variable are dropped.
## Graph traversal definitions
Before we dive into the various traversal algorithms, we need to introduce how we track our traversal. This is the same across all algorithms, but we want to define these states since we'll be using them quite a bit:
* **Unvisited:** You haven't visited this vertex in the graph yet.
* **Visited:** You _have_ visited this vertex.
* **Processed:** You not only visited this vertex, but you've travelled down all of its edges, too.
## Breadth-First Search
So now that we've described some definitions we'll use, let's look at our first graph traversal algorithm: **breadth-first search** (BFS for short). Later we'll look at **depth-first search**, so remove the confusion now, I want you to think on how you describe something by its _breadth_ versus its _depth_. How do you physically describe it in space?
_Breadth_ you probably think in a **left-right relationship**, like extending your arms out to show something broad and wide. The data structure that makes us think of shuffling left-to-right? **A queue.**
_Depth_ you probably think in a **up-down relationship**, like the _depths_ of the sea. The data structure that makes us think of diving down and coming back up? **A stack.**
So when remembering breadth-first search, just remember that **we're going to visit nodes in a left-to-right fashion.** So a tree that looks like this:
```
8
/ \
6 10
/ \ \
4 5 12
```
Will read the tree with BFS in this order: `[8, 6, 10, 4, 5, 12]`. The generic algorithm for all graphs (not just trees) using BFS looks like this:
```js
class Graph {
// ... see last article for our implementation
const PROCESSED = new Array(MAX_VERTICES);
const VISITED = new Array(MAX_VERTICES);
const PARENTS = new Array(MAX_VERTICES);
bfs(startVertex) {
let visitQueue = new Queue();
let currentVertex, nextVertex, tempVertex;
visitQueue.enqueue(startVertex);
visited[start] = true;
while (visitQueue.length > 0) {
currentVertex = visitQueue.dequeue();
console.log("PRE-PROCESSED!");
PROCESSED[currentVertex] = true;
tempVertex = this.connections[currentVertex];
while (tempVertex) {
nextVertex = tempVertex.adjacencyInfo;
if (!PROCESSED[nextVertex] || this.isDirected) {
console.log(`PROCESSED EDGE ${currentVertex}=>${nextVertex}`);
}
if (!VISITED[nextVertex]) {
visitQueue.enqueue(nextVertex);
VISITED[nextVertex] = true;
PARENTS[nextVertex] = currentVertex;
}
tempVertex = tempVertex.nextVertex;
}
console.log("POST-PROCESSED");
}
}
}
```
The runtime here is `O(n + m)`. Why is that? When using an adjacency list, we're going to explore all of the edges for each vertex. Every vertex will get processed once in the queue. Every edge is going to get walked on both coming and going. That makes the total number of operations `n + 2m`, so when we reduce that down for worst-case runtime, that leaves us with `O(n + m)`.
### Why use breadth-first search?
So now we know the _what_ and the _how_ of BFS, but _why_ should we care to use it? Here's a couple reasons:
* **Visualize clustered relationships.** A _cluster_ is just a subset of the graph that is connected. Remember, not all vertices need to be connected to each other. Since BFS doesn't care for an absolute root, it will pick any vertex, find all of its connections, and then keep discovering all of the explicitly-defined vertices that haven't already been discovered via connection. BFS is great for mapping people and neighbors/friends across the world. There's going to be a cluster in Australia. There's going to be a cluster in Japan. Each of these islands creates a physical cluster of relationships. They're all part of the total graph of people in the world, but they aren't all necessarily connected. The best part is, no matter how much stuff we do to find connected components, it only ever adds constant time to find clusters, so a connected clustering algorithm still runs in `O(n + m)`.
* **The Two-Color problem.** This is one of those specific graph theory math problems that's great for BFS. Each vertex gets a label (in this case, a color). The edges of said vertex cannot lead to another vertex of the same color. So `R->G->R` passes, but `R->R->B` fails. The valid graph is known as a _bipartite_ graph; _bi-_ meaning two and _partite_ meaning to divide. There are lots of real-world examples in [biology and medicine](https://academic.oup.com/gigascience/article/7/4/giy014/4875933). Have you ever played Sudoku? A Sudoku [solver](http://www.cs.kent.edu/~dragan/ST-Spring2016/SudokuGC.pdf) is just an application of graph coloring! In fact, **any sort of tree is a bipartite graph** because the whole purpose of the tree is to _divide_ the nodes involved.
## The next Daily Problem
So there's a few clear use cases for BFS. We'll find quite a few more applications for DFS in the next article, but for now, let's think on what we've learned from BFS with a Daily Problem:
> Prove that in a breadth-first search on a undirected graph `G`, every edge is either a tree edge or a cross edge, where `x` is neither an ancestor nor descendant of `y`, in cross edge `(x,y)`.
## More problems to practice on
Now that [homework 3](http://www3.cs.stonybrook.edu/~skiena/373/hw/hw3.pdf) is out, here are a few problems that are relevant to the sorting we've discussed so far:
1. Problem 5-4.
2. Problem 5-7.
That's all for today, stay tuned for depth-first search in the next article!
]]>2018-10-04T10:28:00+00:00How JavaScript Engines Work
https://www.adamconrad.dev/blog/how-javascript-engines-work/
Wed, 03 Oct 2018 10:28:00 +0000https://www.adamconrad.dev/blog/how-javascript-engines-work/2018-10-03T10:28:00+00:00Data structures for graphs in JavaScript
https://www.adamconrad.dev/blog/data-structures-for-graphs/
Tue, 02 Oct 2018 10:28:00 +0000https://www.adamconrad.dev/blog/data-structures-for-graphs/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/algorithm-behind-js-array-sort/), check that out first.
Graphs are the grand unifier of computers, both literally and figuratively. Graphs are the most interesting problems in computer science because they reflect the real-world: networks, communication, and organization all operate in the abstract as a graph. In fact, your computer is reading this article as part of the grand graph known as the internet. Graphs are the analogy for connectivity, so when we study graphs, **we are studying how the people of the world interact with each other.**
As we mentioned in a [previous article about designing algorithms](/blog/how-to-design-an-algorithm/), when we describe graphing problems, we often use words like _network_, _circuit_, _web_, and _relationship_. When describing problems with properties of those words, the solution is often an algorithm of navigating a graph. More important than knowing solutions to graph problems is **recognizing the kinds of graphs, their data structures, and the kinds of problems they run into**.
## Kinds of graphs
There are a _lot_ of different kinds of graphs. **A graph is just a collection of nodes (or vertices) and the edges that connect them.** Mathematics succinctly defines a graph as `G = (V,E)`, so just like we use [Big O notation](/blog/big-o-notation/) to describe runtimes, so too will we use this graphing notation. But what kinds of graphs will we be dealing with?
* **By connectivity.** _Connected graphs_ have all vertices connected to each other. _Disconnected graphs_ have some vertices that have no edges connecting them. Bittorrent networks are connected graphs since each vertex offers a portion of the files they share. Bittorrent networks can also be disconnected graphs if someone has a file that isn't being shared (or desired) by any other computer on the network.
* **By direction.** _Directed graphs_ will have edges that force you to move from one vertex to the other in a certain direction. _Undirected graphs_ let you freely travel on an edge in either direction. Two-way streets are undirected. Arteries (vessels bringing blood away from the heart) and veins (vessels bringing blood back to the heart) are directed.
* **By weight.** _Weighted graphs_ put a price on the edges you travel between vertices. _Unweighted graphs_ have every edge cost the same. It costs more time (and money) to travel from Boston to Seattle than it does to travel from San Francisco to Los Angeles. It never costs anything to walk from your home to the post office and back, and that distance is always the same.
* **By simplicity.** _Simple graphs_ connect to other nodes just once. _Non-simple graphs_ can connect to other nodes with multiple edges, _or even themselves_. Tunnels, like subway systems, are simple graphs because each tunnel edge is meant to get from one place to the other. Roads, on the other hand, are non-simple because there are so many ways to get from one building to the next.
* **By density.** _Dense graphs_ have a lot of edges connecting vertices to each other. _Sparse graphs_ have only a few edges connecting vertices. Unlike the other examples, this distinction is kind of vague, but if you can describe the connections in a _quadratic_ order (i.e. every node connected to each other) it's dense. If you can describe the connections in a _linear_ order (i.e. every node is one other node or its neighbors) it's sparse. _Most problems deal with sparse graphs_. Roads are sparse graphs since each intersection is usually only the crossing of two streets. Broadcast networks are dense since a broadcast has to reach all of its subscribers.
* **By cycles.** _Cyclic graphs_ will repeat back to one origin vertex at some point. Conversely, _acyclic graphs_ do not. Tree data strutures can be described as _connected, undirected, acyclic graphs_. Adding direction to those graphs (also know as _DAGs_ for short) gives them a _topology_ (i.e. relation or arrangement) between each other. Scheduling problems where _A_ must come before _B_ are represented as DAGs and can be reasoned about with topological sorting algorithms, which we'll learn about a bit later.
* **By topology.** _Topological graphs_ have vertices that are related by how their edges are portrayed. For example, if points on a map are shaded and arranged by their distance and height like on a real topography map, that creates a topological graph. Otherwise, _embedded graphs_ represent projections of how real graphs are connected. Any drawing of a graph is an example of an embedded graph.
* **By implication.** _Implicit graphs_ have some murky details. They don't fully describe all of the connections in a graph. In fact the traversal of such a graph is how we may uncover a graph to make it explicit. _Explicit graphs_ on the other hand clearly outline all of the vertices and all of the edges from the beginning. If you've ever played Super Metroid, the [map of planet Zebes](http://www.snesmaps.com/maps/SuperMetroid/SuperMetroidMapZebes.html) starts as an implicit graph. The connection between zones is not known until you explore them. Only when you've played the whole game can you create an explicit graph of the game map.
* **By labeling.** _Labeled graphs_ give names or keys to each vertex, while _unlabeled graphs_ do not make that distinction. The fact that we can name a path between Boston and Seattle means that the graph of roads between cities (the cities being the label on each vertex) is a labeled graph.
## Data structures for graphs
Now that we know the kinds of graphs, how do we represent them in the memory of a computer? A few such data structures arise:
* **Adjacency matrix**. If there are `n` vertices in a graph, we can create an `n`x`n` matrix (i.e. an array of arrays) which represents how each vertex on a row is connected to each vertex on a column. A cell with a value of `1` means the row vertex is connected to the column vertex. Otherwise, the cell has a value of `0`.
* **Adjacency list**. A more space-efficient way to represent a sparse graph is with a linked list, where the pointers represent the edges that connect neighboring vertices to each other.
So we have two major data structures for storing the data of graph problems. How do you know when to use each?
### When adjacency matrices are a better fit
* **Testing if an edge is in a graph.** We said indexing from arrays is faster than lists. This is just a multi-dimensional application of this concept.
* **When you have really big graphs.** When space isn't a concern, indexing directly into a graph's relationships is going to be faster than having to traverse a list to find vertices deep within a graph.
* **When you're focused on writing connections (e.g. insertions and deletions of edges) to the graph.** Since we've represented every connection, it's as simple as toggling a `0` to a `1` which occurs in constant time.
### When adjacency lists are a better fit
* **Finding the degree of a vertex.** The _degree_ just means the number of connections to another vertex. Since a linked list inherently stores the pointers to other nodes, it's trivial to find the degree by simply summing the total number of pointers at a given vertex, which takes a constant-time number of operations. A matrix, however, requires summing the number of `1`s in a given row for a vertex of relationships, which involves traversing that array which takes linear time instead of constant time.
* **When you have small graphs.** Conversely to a matrix, with small graphs the relationships are simple to map as pointers in a list, which are going to be very fast to access, and much more space efficient.
* **When you're focused on reads and traveling around (i.e. traversing) a graph.** Since the data structure of a linked list bakes in relationships between nodes, this is naturally a good data structure for navigating from vertex to vertex. Matrices, on the other hand, verify relationships but offer no hint of where to go next, so it's a chaotic mess that will take at most `O(n^2)` calls on average while a list will do that in linear time.
Since most problems focus on solving these kinds of issues, **this is usually the right data structure for solving most graphing problems.**
Let's see how these might look in JavaScript:
```js
class Vertex {
constructor(adjacencyInfo = null, nextVertex = null, weight = null) {
this.adjacencyInfo = adjacencyInfo;
this.nextVertex = nextVertex;
this.weight = weight;
}
}
class Graph {
const MAX_VERTICES = 1000; // can be any number
constructor(hasDirection) {
this.connections = new Array(MAX_VERTICES).fill(null);
this.degrees = new Array(MAX_VERTICES).fill(0);
this.edges = 0;
this.isDirected = hasDirection;
this.vertices = 0;
}
insert(vertexStart, vertexEnd) {
let vertex = new Vertex(vertexEnd, this.connections[vertexStart]);
this.connections[vertexStart] = vertex;
this.degrees[vertexStart] += 1;
if (this.isDirected) {
this.edges += 1;
} else {
this.insert(vertexEnd, vertexStart);
}
}
print() {
for (let i = 0; i < this.vertices; i++) {
console.log(`${i}: `);
let connection = this.connections[i];
while (connection) {
console.log(` ${connection.adjacencyInfo}`);
connection = connection.next;
}
console.log('\n'); // new line for clarity
}
}
}
```
## The next Daily Problem
To get us thinking a bit about graphs, our next Daily Problem is Problem 5-8 from _The Algorithm Design Manual_:
> Present correct and efficient algorithms to convert an undirected graph `G` between the following graph data structures. You must give the time complexity of each algorithm, assuming `n` vertices and `m` edges.
>
> (a) Convert from an adjacency matrix to adjacency lists.
>
> (b) Convert from an adjacency list to an incidence matrix. An incidence matrix `M` has a row for each vertex and a column for each edge, such that `M[i,j] = 1` if vertex `i` is part of edge `j`, `otherwise M[i,j] = 0`.
>
> (c) Convert from an incidence matrix to adjacency lists.
Think you have it? Have questions? Send it over [on Twitter](https://twitter.com/theadamconrad) and I'd be happy to help, and good luck!
]]>2018-10-02T10:28:00+00:00Verifying request signatures in Elixir/Phoenix
https://www.adamconrad.dev/blog/verifying-request-signatures-in-elixir-phoenix/
Sun, 30 Sep 2018 09:10:00 +0000https://www.adamconrad.dev/blog/verifying-request-signatures-in-elixir-phoenix/ get_req_header("x-slack-request-timestamp")
|> Enum.at(0)
|> String.to_integer()
```
The local timestamp is kind of weird. The easiest way is to leverage the Erlang API but the numbers start with the start of the Gregorian calendar (so roughly 2018 years ago) rather than the standard UNIX timestamp (Jan 1, 1970). So you'll just have to subtract that difference, which is an oddly specific magic number:
```elixir
@unix_gregorian_offset 62_167_219_200
gregorian_timestamp =
:calendar.local_time()
|> :calendar.datetime_to_gregorian_seconds()
local_timestamp = gregorian_timestamp - @unix_gregorian_offset
```
Now you just need to take the absolute value of the difference and make sure it's within some reasonable delta (in the case of the tutorial, it's 5 minutes or 300 seconds):
```elixir
if abs(local_timestamp - timestamp) > 300 do
# nothing / return false
else
# process request / return true
end
```
### 3. Concatenate the signature string
This is the easiest step because interpolated strings are just as easy as you think they are, and now that you have the raw request body, grabbing this will be trivial:
```elixir
sig_basestring = "v0:#{timestamp}:#{conn.assigns[:raw_body]}"
```
### 4. Hash the string with the signing secret into a hex signature
Now you need a signature to compare against the one Slack sends you along with the timestamp. Erlang provides a nice crypto library for computing [HMAC-SHA256 keyed hashes](https://en.wikipedia.org/wiki/HMAC), you'll just need to turn that into a hex digest (using `Base.encode16()`):
```elixir
my_signature =
"v0=#{
:crypto.hmac(
:sha256,
System.get_env("SLACK_SIGNING_SECRET"),
sig_basestring
)
|> Base.encode16()
}"
```
### 5. Compare the resulting signature to the header on the request
The light is at the end of the tunnel! There are a few more gotchas that aren't exactly intuitive:
1. **`get_req_header` returns an array, even though the same sounds like it returns the singular value**
2. **Leverage the `Plug.Crypto` library to do secure signature comparisons**
As you may have seen from earlier code, `get_req_header` takes in your `conn` and the string of the request header key to return the value...as an array. Not sure why but it's easy to remedy with pattern matching.
Finally, to achieve the `hmac.compare` pseudocode from the Slack tutorial, Elixir has an equal `Plug.Crypto.secure_compare`:
```elixir
[slack_signature] = conn |> get_req_header("x-slack-signature")
Plug.Crypto.secure_compare(my_signature, slack_signature)
```
## Putting it all together
Now that we've solved all the tutorial discrepancies, it's time to put it all together in the context of a Phoenix application.
We've already added the custom raw request body header library and modified our endpoint to accept this `Plug.Parser`. Now we just need to bring the verification into our API controller endpoint you specify within your Slack app dashboard:
```elixir
defmodule MYAPPWeb.SlackController do
use MYAPPWeb, :controller
@doc """
Handle when a user clicks an interactive button in the Slack app
"""
def actions(conn, params) do
%{
"actions" => actions,
"team" => %{"id" => team_id},
"type" => "interactive_message",
"user" => %{"id" => user_id}
} = Poison.decode!(params["payload"])
if verified(conn) do
# do your thing!
conn |> send_resp(:ok, "")
else
conn |> send_resp(:unauthorized, "")
end
end
defp verified(conn) do
timestamp =
conn
|> get_req_header("x-slack-request-timestamp")
|> Enum.at(0)
|> String.to_integer()
local_timestamp =
:calendar.local_time()
|> :calendar.datetime_to_gregorian_seconds()
if abs(local_timestamp - 62_167_219_200 - timestamp) > 60 * 5 do
false
else
my_signature =
"v0=#{
:crypto.hmac(
:sha256,
System.get_env("SLACK_SIGNING_SECRET"),
"v0:#{timestamp}:#{conn.assigns[:raw_body]}"
)
|> Base.encode16()
}"
|> String.downcase()
[slack_signature] = conn |> get_req_header("x-slack-signature")
Plug.Crypto.secure_compare(my_signature, slack_signature)
end
end
end
```
Congratulations! You can now securely talk with Slack by verifying it's header signature against the one you've generated by hashing your signing secret with the current timestamp. The Slack API documentation is thorough and helpful, but translating it to work in Elixir/Phoenix is not as intuitive as you might imagine.
]]>2018-09-30T09:10:00+00:00The algorithms behind the Sort function in JavaScript
https://www.adamconrad.dev/blog/algorithm-behind-js-array-sort/
Tue, 25 Sep 2018 10:28:00 +0000https://www.adamconrad.dev/blog/algorithm-behind-js-array-sort/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/heapsort-priority-queues-in-js/), check that out first.
## Daily Problem
> You are given a collection of `n` bolts of different widths, and `n` corresponding nuts. You can test whether a given nut and bolt together, from which you learn whether the nut is too large, too small, or an exact match for the bolt. The differences in size between pairs of nuts or bolts can be too small to see by eye, so you cannot rely on comparing the sizes of two nuts or two bolts directly. You are to match each bolt to each nut.
Basically this question is asking "does every bolt have a matching nut?"
> 1. Give an `O(n^2)` algorithm to solve the above problem.
The obvious play here seems to be to test every bolt with every nut, which would look like this:
```js
function match(nuts, bolts) {
let pairs = [];
for (nut in nuts) {
for (bolt in bolts) {
if (nut.fits(bolt) === "exact match") pairs.push([nut, bolt]);
}
}
return pairs;
}
```
There are `n` nuts in a `for` loop checking every `n` bolts in a nested `for` loop, so we can easily determine from our previous discussion on calculating Big O quickly that this algorithm will run in `O(n^2)` time.
> 2. Suppose that instead of matching all of the nuts and bolts, you wish to find the smallest bolt and its corresponding nut. Show that this can be done in only `2n − 2` comparisons.
We start out with `n` nuts and `n` bolts. Start comparing two at random. Likely they will not be the same size, and we will get feedback on that. If something is `too small` for the other, then we keep that item and toss the other piece. So if the nut was `too small` for the bolt, throw away the bolt, grab another one, and then do another comparison. Essentially you will be ping-ponging between nuts and bolts, throwing away all but the last nut and bolt. Since there are `2n` total items, and we throw away all but the last nut (`1`) and the last bolt (`1`), which is `2n-2` comparisons.
What do we do if there is the same? Keep one of either the nut or the bolt aside and do another comparison. If the new nut/bolt is smaller than both, we have our new minimum item and we can throw away both. Regardless, we will get a signal from the new item that will eliminate the kept item or the new item drawn.
> 3. Match the nuts and bolts in expected `O(n log n)` time.
This was one of those Daily Problems that would benefit from reading ahead a bit. In fact, the answer to these 2 parts will become very clear once we get through a few of the algorithms from this article.
## Mergesort: The divide-and-conquer algorithm behind Firefox array sorting
As I mentioned earlier, there is an algorithm that is so good it is used to power the web browser you may be viewing this blog post on. This algorithm is called _mergesort_, and it [was the preferred algorithm](https://bugzilla.mozilla.org/show_bug.cgi?id=224128) for array sorting in Firefox's JavaScript engine over heapsort.
Mergesort works by taking advantage of an essential problem solving technique called **divide-and-conquer**. Practically speaking, divide-and-conquer involves **breaking a problem up into a smaller problem to make it solvable.**
The way we implement divide-and-conquer with mergesort is to _use the power of recursion to halve our problem space_ until all that's left is a simple two item comparison. Once we compare and sort the subarray with two items, we build those lists back up, _merging_ these subarrays together until we have unified the whole array again, sorting our items as we build the list back up.
```js
function merge(leftArr, rightArr) {
let mergedArr = [];
while (leftArr.length || rightArr.length) {
if (leftArr[0] <= rightArr[0]) {
mergedArr.push(leftArr.shift());
} else {
mergedArr.push(rightArr.shift());
}
}
while (leftArr.length) leftArr.shift();
while (rightArr.length) rightArr.shift();
return mergedArr;
}
function mergesort(arr) {
return merge(
mergesort(arr.slice(0, arr.length / 2)),
mergesort(arr.slice(arr.length / 2, arr.length))
);
}
```
### Why use mergesort over heapsort
Truth be told, **mergesort is a very good general sorting algorithm**. Like heapsort, mergesort runs in `O(n log n)` time. If you have the space to copy the array into the sorted array (which you usually do if you aren't in constrained environments like graphics programming or IoT devices), mergesort is great. So if it has the same worst-case runtime as heapsort, why would you choose it over heapsort?
**Mergesort is also excellent for linked lists.** The `merge()` operation of mergesort can be implemented without requiring extra space.
**Mergesort is stable.** The only swapping of items happens when you actually sort within mergesort. In heapsort, since we're using a priority queue, the structure itself is unstable because it's frequently being rearranged to ensure the maximum/minimum value remains at the root of the heap.
## Quicksort: The randomization algorithm powering Chrome's V8 and IE's Chakra
It turns out there are other excellent algorithms for sorting. One such algorithm uses the power of randomization to arbitrarily find a pivot point to sort against. We can use that pivot point recursively just like we do with mergesort to continuously sort smaller and smaller problems until the entire list of items has been sorted. This algorithm is called **quicksort**, and it's used by both [Google](https://github.com/v8/v8/blob/master/src/js/array.js#L328) and [Microsoft](https://github.com/Microsoft/ChakraCore/blob/master/lib/Runtime/Library/JavascriptArray.cpp#L6701) for sorting.
Quicksort and mergesort have quite a bit in common. They both build a nested recursion tree all the way down to every `n` items within a list, each of which takes linear time to process. That means they both generally run in `O(nh)`, where `h` is equal to the height of the recursion tree that is generated.
Where they differ lies in the randomization of the pivot point. Whereas with mergesort we always subdivide our arrays evenly, in quicksort that point is never known until the function is run.
In the best case, we pivot on the median point, halving the array each time leading us to a runtime of `O(log n)` and is on par with heapsort and mergesort. In the worst case, we pivot on the end of the array with a completely unbalanced tree that only ever sorts 1 item at a time giving us runtime of `O(n^2)` and is on par with selection sort.
So if the worst case is so bad, why would we bother using quicksort? The answer lies in the average case random sampling.
With mergesort the best and only case is the exact median of the list. If the list isn't perfectly _unsorted_, you could be suboptimally generating your recursion sorting tree.
Similarly, with premeditated data, we know that we can find a configuration that gets us the best- and worst-case scenarios. But since unsorted data can be in any random order, how does quicksort help us?
With quicksort, the middle _half_ of the list (e.g. for `[1,2,3,4]`, that's `[2,3]`) is a good enough pivot to not hit the worst-case runtime. The reason is a pivot somewhere in the middle still generates a recursive tree that is some logarithmic order of magnitude (specifically `2n ln n`), the kind of speed we're looking for.
To solidify a `O(n log n)` runtime, we grab a _random_ permutation of our initial list which takes `n` time. And since _random_ data can be sorted on average in `O(log n)`, adding in this randomization step _guarantees_ we can provide an `Θ(n log n)` runtime with _any_ input.
```js
function partition(arr, pivotIndex, lower, higher) {
let pivotPoint = arr[pivotIndex];
let partitionIndex = lower;
for (let i = lower; i < higher; i++) {
if (arr[i] < pivotPoint) {
[arr[i], arr[partitionIndex]] = [arr[partitionIndex], arr[i]];
partitionIndex++;
}
}
[arr[right], arr[partitionIndex]] = [arr[partitionIndex], arr[right]];
return partitionIndex;
}
function quicksort(arr, lower, higher) {
let pivotIndex = higher;
let partitionIndex = partition(arr, pivotIndex, lower, higher);
quicksort(arr, lower, partitionIndex - 1);
quicksort(arr, partitionIndex + 1, higher);
}
```
### Why use quicksort
Now we have three algorithms with an expected runtime of `O(n log n)`. Why would we choose quicksort?
**Quicksort performs in-place sorting.** Like heapsort, quicksort requires no extra space.
Mergesort requires an additional array to copy your merged items into. If you have space constraints, quicksort beats mergesort.
**Quicksort beats mergesort for arrays.**
Since quicksort takes advantage of random access, we already know that the constant-time access of arrays by index makes it very easy to use a random number and retrieve an item quickly. With linked lists, the random access no longer pays off, since all items must be accessed from either the front or the back of the list.
**Quicksort is great for sorting arrays with hundreds of thousands of unique keys**.
In fact, when benchmarked against 100,000 unique keys or more, [quicksort outperformed mergesort and heapsort by 1.5-2x](http://alejandroerickson.com/j/2016/08/02/benchmarking-int-sorting-algorithms.html).
### Revisiting the Daily Problem
With the available algorithms at our disposal, this should make the Daily Problem feasible to answer beyond the naive first section.
> Match the nuts and bolts in expected `O(n log n)` time.
With our randomized quicksort algorithnm in place, we can now sort both nuts and bolts by size, and both sets will hvae a pairing because the first element of one matches the first element of the other, and so on all the way up to the `n`th element. If this still doesn't make sense, check out an [in-depth analysis of randomized quicksort](https://www.cs.cmu.edu/~avrim/451f11/lectures/lect0906.pdf).
## Bucketsort: Sorting via distribution
With three sorting algorithms in our optimal band of `O(n log n)` already, how could we possibly get any better? Consider the following example:
Rather than a list of numbers, you now have a list of names, like in your iPhone. How do you find a name in your phone? If you scroll really quickly, you'll see a letter highlighting roughly which letter of the alphabet you're in. From there you'll pretty much go through the whole alphabet starting with the second letter of the name until you find the person you're looking for. _This is exactly how bucketsort works (also known as distribution sort or binsort).
**Bucketsort is great for evenly-distributed chunks of similar data.** If we used bucketsort in the above example, we would first partition the whole phone book into 26 subarrays (well maybe a few less because letters like _X_ and _Z_ have far fewer names than _M_ or _S_).
I hope you're seeing a theme here. We once again are _subdividing the problem_; instead of dividing equally in halves or by a random pivot, we branch a list of `n` items by `k` logical partitions based on how the data is arrange. Now obviously we can't implement a generic algorithm for bucketsort in JavaScript because it highly depends on the dataset. Bucketsort for last names will look very different from bucketsort on tabular accounting data for a business.
## Binary search: The algorithm that powers BSTs
Now that we have a gaggle of sorting algorithms to choose from, what can we do with this sorted data to ensure we're still operating efficiently? **Binary search is the `log n` application for searching sorted data.**
All we have to do is start with the middle element. If it's not the middle element, we ask "is our target item less than or greater than the middle element?" We then choose a side and repeat the process until our target element is found:
```js
function binarySearch(arr, target) {
let middle = Math.floor(arr.length / 2);
if (arr[middle] === target) return middle;
if (arr[middle] < target) {
binarySearch(arr.slice(middle + 1), target);
} else {
binarySearch(arr.slice(0, middle - 1), target);
}
}
```
### Meta binary search
What happens if you have an unbounded sorted array? For example, we can create an unbounded array using [ES6 generators](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/function*). The fibonacci sequence is unbounded, and can be lazily evaluated step-by-step using generators. How would we find an item within such an array?
**Meta binary search**, or one-sided binary search, to the rescue! This search is twice is slow because it requires two `log n` searches. The first search is find our element within some bounds. So we start with a single-element array. Then a two-element array. Then a four-element array. And so on and so on, doubling our bounded subarray until we find our element within some boundary. Once we find the element, we can then proceed with our normal binary search.
### Bisection method / numerical analysis
Do you have to divide things in half? The **bisection method** (which you can read more about [here](https://en.wikipedia.org/wiki/Bisection_method)) is a way finding the roots of values by repeatedly _bisecting_ an interval until the target is found. In other words, this is a more generalized form of binary search and can operate on the square root, cubed root, or any root of a number.
[Numerical analysis](https://en.wikipedia.org/wiki/Numerical_analysis#Computing_values_of_functions) is deeply rooted in mathematics. **If you're doing graphics programming or machine learning, you may find yourself utilizing the bisection method.**
## Bonus sorts: Shellsort & Radix Sort
I wanted to cover two other sorting algorithms really quick because they're super interesting and are great advanced algorithms for specific situations. They aren't really covered in the _Algorithm Design Manual_ but they do have practical applications.
**Shellsort** is like an optimized version of either _bubble sort_ (where you continuously swap smaller elements with larger elements) _insertion sort_ (where you insert unsorted elements into the sorted portion of an array), depending on how you look at it. Shellsort sorts in intervals known as _gaps_, and iteratively reduces the number of gaps it sorts by until the whole array is sorted.
For example, for an array of a randomized ordering of natural numbers (integers greater than 0) up to 100, shellsort would first sort the numbers [5,10,15,20,...100]. Then it would sort [3,6,9...99]. And that _gap_ interval would reduce down until it got to a gap of 1. But by the time it was sorting against every integer, all of the 5s were sorted, and all of the 3s were sorted, leaving only a fraction of the actual numbers left to sort.
Like bucketsort, shellsort requires a custom gap sequence depending on the application of your sorting algorithm. You can also choose either insertion sort or bubble sort as the inner function to perform on your gap subset. You can find an example [here](https://en.wikipedia.org/wiki/Shellsort#Pseudocode), but like I said, it's difficult to really nail in the general sense since it is highly dependent on the data you are sorting.
**Radix sort** (of which there are actually two) is arguably one of the fastest sorting algorithms known. If you really dove into the benchmarking link I posted above, it outperforms every other sorting algorithm we've mentioned here so far. Why is that the case?
At the end of the day everything is binary. Integers, strings, floating-point numbers, and everything else on a computer is all represent as binary numbers. Because of that, sorting objects by their binary code is a really convenient way to sort objects. Because of that, radix sort is really two algorithms, where you can sort starting on the least-significant digit or the most-significant digit of our object, now represented as a number.
One great example of using radix sort (specifically the MSD version) would be to [find the occurrence of a word](https://en.wikipedia.org/wiki/Radix_sort#trie-based_radix_sort) using tries to construct a word dictionary and then depth-first search to find the word you are looking for, starting with the first (or most significant) letter.
There are lots of other sorting algorithms that you can explore. As we move on from simple data structures and sorting to graphs and advanced tree searching, we'll demystify what exactly are things like tries and depth-first search. But this section was exhausting, so let's end with the Daily Problem and call it a day for now.
## Even more practice
If you just can't get enough of sorting problems, here are some more problems from the book to flex your sorting muscles:
1. 4-20
2. 4-22
3. 4-24
]]>2018-09-25T10:28:00+00:00Why Airbnb has pivoted away from React Native
https://www.adamconrad.dev/blog/why-airbnb-has-pivoted-away-from-react-native/
Thu, 20 Sep 2018 10:28:00 +0000https://www.adamconrad.dev/blog/why-airbnb-has-pivoted-away-from-react-native/2018-09-20T10:28:00+00:00Heapsort using Priority Queues in JavaScript
https://www.adamconrad.dev/blog/heapsort-priority-queues-in-js/
Thu, 20 Sep 2018 10:28:00 +0000https://www.adamconrad.dev/blog/heapsort-priority-queues-in-js/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/intro-to-sorting/), check that out first.
## Daily Problem Explained
> Give an efficient algorithm to determine whether two sets (of size `m` and `n`) are disjoint. Analyze the complexity of your algorithm in terms of `m` and `n`. Be sure to consider the case where `m` is substantially smaller than `n`.
This is just another exercise in writing pseudocode; a great problem for a programming interview! The only real catch here is you have to know what a [disjoint set](https://en.wikipedia.org/wiki/Disjoint_sets) is. A **disjoint set** is really just _two collections that have no items in common_. An example for this problem would be _`m = [1,2,3], n = [4,5,6]`_.
Let's just start throwing stuff out there and see what sticks. My first thought is that when you combine set of two disjoint sets is of length `m + n`, since no duplicates exist. That's pretty important because if your final set `S` is such that `S.length < m.length + n.length` you fail the disjoint part of the problem. Can we exploit that? Here's what I'm thinking:
```js
function isDisjoint(m, n) {
let finalSet = m;
for (let item in n) {
if (!finalSet.includes(item)) finalSet.push(item);
}
return finalSet.length === (m.length + n.length);
}
```
On first glance you only see 1 `for` loop so it looks like it's `O(n)` which is fast! But then we quickly realize the absolute fastest we can go is `O(n+m)` since we have to evaluate all items to duplication, so something is up. T
The key here is that `includes` has to inspect the entire `m` list for each `n` items. That leaves us with an algorithm of `O(mn)`. It's not bad but it's not great either; remember, that's roughly equivalent to `O(n^2)` (assuming `m` and `n` were the same length) which is quadratic. Quadratic functions are fine for relatively small data sets, but as we noted [in our article on Big O Notation](/blog/big-o-notation/), once we go above a million items in our dataset, the algorithm will be too slow. Can we improve on this?
One thing I quickly realized is we can leverage what we've learned so far as a hint on how to solve this. What data structures have we been studying carefully? Which of those are praised for their performance?
Then it hit me: **we can use a hash table to keep track of the number of times we've touched an item and if any already exist we know right away the set is not disjoint and we can exit early**.
Adding to and searching from a hash table takes an expected `O(1)` time, `O(n)` at worst. That means we can hit that magical runtime of `O(m+n)` in the worst-case using a hash table!
```js
function isDisjoint(m, n) {
let finalSet = new HashTable();
for (let item in m) {
finalSet.insert(item);
}
for (let item in n) {
if (finalSet.search(item)) {
return false;
} else {
finalSet.insert(item);
}
}
return true;
}
```
## Standard Selection Sort in JavaScript
The first sorting algorithm we're going to look at is **selection sort**. Selection sort's algorithm is very simple: _continue to take the smallest item from one array a place it into a new array_. By virtue of taking the smallest item from a recursively-shrinking array, you're left with a new array that is sorted. A naive implementation using standard JavaScript library functions looks like this:
```js
function selectionSort(arr) {
let sorted = [], min;
for (let i = 0, n = arr.length; i < n; i++) {
min = Math.min(...arr);
sorted[i] = min;
arr.pop(min);
}
return sorted;
}
```
As we can see from the algorithm above, selection sort performs `n` iterations, but each iteration will take roughly `n/2` steps in order to find the minimum value (we know this because finding a minimum value takes at worst `O(n)` but our problem space is being reduced with each successive iteration), giving us a grand total **worst-case runtime of `O(n^2)`**. Is there a way we can improve on this runtime?
## Selection Sort + Priority Queues = Heapsort
Look back to our previous algorithm - what are the primary operations for selection sort? Each iteration consists of a `min()`, an `insert()`, and a `delete()`. There is, in fact, a special data structure that combines `delete()` with `min()` to _efficiently_ extract the minimum value from its collection. This data structure is called a **priority queue**.
### Priority queues
Priority queues are flexible data structures that _allow for elements to enter the queue at arbitrary intervals while exiting in a sorted fashion_. Priority queues support three primary operations: _insert_, _findMin_ (or _findMax_), and _extractMin_ (or _extractMax_). For the purposes of simplicity going forward, we're going to focus on priority queues that implement the _minimum_ rather than the _maximum_ value.
A summary of their worst-case runtimes can be found below. **As an exercise: can you explain how these values are calculated?** Here's a hint: `findMin()` is always constant-time complexity because you can always use a variable to keep track of the minimum value as new values are inserted.
Data structure / method
findMin()
extractMin()
insert()
Unsorted array
O(1)
O(n)
O(1)
Sorted array
O(1)
O(1)
O(n)
BST
O(1)
O(log n)
O(log n)
### Implementing `extractMin()` and `insert()` with heaps
As I mentioned earlier, there is a special data structure that can efficiently handle the primary operations of selection sort. This data structure is known as a **heap**. Heaps are a variant of a priority queues that keep a rough track of the order of its items. This **ordering is smart enough to keep track of the minimum/maximum value, but not a complete sort which would significantly slow down the data structure**.
To implement a heap, we are effectively creating a _binary tree without worrying about pointers_, or in other words _an array of keys where the index of the array item _is_ the pointer_. The root of the tree is the first item in the array (and also the minimum value), with its left and right children corresponding to the 2nd and 3rd values in the array. In general, it looks something like this:
```js
class PriorityQueue {
const PRIORITY_QUEUE_SIZE = 500; //arbitrarily picked
constructor(arr, n) {
this.queue = new Array(PRIORITY_QUEUE_SIZE);
this.length = 0;
for (let i = 0; i < n; i++) {
this.insert(arr[i]);
}
}
parent() {
if (this.length === 0) return -1;
return Math.floor(this.length / 2);
}
child(n) {
return 2 * n;
}
}
```
Remember: we don't care about what's inside of each slot in our priority queue, we just care about finding elements at specific locations. That is why we can calculate locations with just some multiplication and division rather than having to search by value. Because we're not pointing to any specific values, the index of our elements is a reflection of the math we are performing, and when we perform basic operations like `*` and `/`, we know we're dealing with fast computational operations in constant time.
The downside to a heap is that we can't really do an efficient search in a heap. Because we're only concerned about keeping the smallest item at the root, everything else is arbitrarily packed into the tree. The tree items are packed to maintain space efficiency, but remember this isn't a binary _search_ tree, just a binary tree (i.e. each node has two children).
#### Insertion
As we mentioned earlier, save for the smallest item, the heap is just packed in from the top down, filling in from left to right before moving down to the next layer of the tree. This keeps the heap efficiently packed. To ensure the minimum-dominance relationship is maintained, we have to make sure we can swap up the tree as far as necessary to satisfy our minimum constraints.
```js
class PriorityQueue {
// ...
insert(item) {
if (this.length >= PRIORITY_QUEUE_SIZE) {
throw new Error("Priority queue overflow");
} else {
this.length += 1;
this.queue[this.length] = item;
this.bubbleUp(this.length);
}
}
bubbleUp(parent) {
if (this.parent(parent) === -1) return;
let currentParrent = this.queue[this.parent()];
let parentContender = this.queue[parent];
if (currentParrent > parentContender) {
[currentParent, parentContender] = [parentContender, currentParrent];
this.bubbleUp(parent);
}
}
}
```
Analyzing the above two functions, we see that `insert()` has no loops and emulates a tightly-packed binary tree. We know the height of a binary tree is `lg n` (shorthand for `log n` of base 2), so our worst-case runtime is `O(log n)`.
The `bubbleUp()` has the primary function of swapping elements (remember from ES6 the swap appears as `[x,y] = [y,x]`) which takes constant time. Therefore, to create a heap with `n` items it will take `O(n log n)` to insert and construct.
#### Extracting the minimum value
Finding the minimum value is easy, since it is the root of the tree. Actually pulling it out of the array is the hard part, since this leaves the tree broken in two at the root. To handle this, we'll need to re-assign the right child of the root's children to be the new root.
Why the right-most element? Because _we pack from left to right, so we want to unfurl our array in a clean manner from the opposite direction_. If the value isn't correct in the minimum-dominance relationship, we can always bubble _down_ the array (or "heapify") to swap out for the correct minimum.
```js
class PriorityQueue {
// ...
extractMin() {
let min = -1;
if (this.length <= 0) throw new Error("No minimum: Empty priority queue");
min = this.queue[0];
this.queue[0] = this.queue[this.length];
this.length -= 1;
bubbleDown(1);
return min;
}
bubbleDown(parent) {
let childIndex = child(parent), minChild = parent;
for (let i = 0; i <= 1; i++) {
if (childIndex <= this.length) {
if (this.queue[minChild] > this.queue[childIndex + i]) {
minChild = childIndex + i;
}
}
}
if (minChild !== parent) {
[minChild, parent] = [parent, minChild];
bubbleDown(minChild);
}
}
}
```
Similarly for extracting the minimum value, bubbling down only goes down a tree height of `lg n`, so at worst finding and removing the minimum value will take `O(log n)` time.
## Selection Sort using Priority Queues as Heaps
Putting it all together, we now have our optimized selection sort, known as _heapsort_, which combines everything we've learned here into an efficient extraction machine:
```js
function heapsort(arr, n) {
let heap = new Heap(arr, n);
for (let i = 0; i < n; i++) {
arr[i] = heap.extractMin();
}
return arr;
}
```
How elegant! And the best part: since our priority queue's operations run in `O(log n)` time, over a set of `n` elements, **heapsort improves on selection sort from `O(n^2)` to `O(n log n)`**, the gold standard for sorting algorithm time efficiency!
Even better, we did it _in-place_, meaning we didn't have to instantiate a second array to copy the elements over. We were able to use the same array to swap elements and create our sorted list of items.
Overall, this meaps heapsort an excellent algorithm for both its space and time efficiency.
Now there is a _slightly_ more advanced way of designing the constructor such that we populate the first `n` positions in heap directly from our input array (meaning we've created a one-sided heap) and then use a bunch of `bubbleDown()` calls (`n/2` to be exact since half of the items are already on the correct side of their parents) to balance out the array and find our minimum. This speeds up construction from `O(n log n)` to `O(n)` on average, but since construction is not the main operation of a heap, and the worst-case operations still take `O(n log n)`, this isn't too big of a win, **but you're welcome to implement it for yourself if you're looking for an extra exercise in algorithm design**.
## Wrapping up with the next Daily Problem
We've covered a lot with selection sort, priority queues, and heapsort. We've seen we can pretty easily cover sorting in `O(n^2)` time, but with a few optimizations, we can improve that time to the optimal `O(n log n)`. This will be useful in tackling today's Daily Problem:
> You are given a collection of `n` bolts of different widths, and `n` corresponding nuts. You can test whether a given nut and bolt together, from which you learn whether the nut is too large, too small, or an exact match for the bolt. The differences in size between pairs of nuts or bolts can be too small to see by eye, so you cannot rely on comparing the sizes of two nuts or two bolts directly. You are to match each bolt to each nut.
>
> 1. Give an `O(n^2)` algorithm to solve the above problem.
>
> 2. Suppose that instead of matching all of the nuts and bolts, you wish to find the smallest bolt and its corresponding nut. Show that this can be done in only `2n − 2` comparisons.
>
> 3. Match the nuts and bolts in expected `O(n log n)` time.
Think you have an answer? Create a [gist](https://gist.github.com) and send it to me [on Twitter](https://twitter.com/theadamconrad) and let's compare notes!
## More problems to practice on
Now that [homework 2](http://www3.cs.stonybrook.edu/~skiena/373/hw/hw2.pdf) is out, here are a few problems that are relevant to the sorting we've discussed so far:
1. 4-1
2. 4-2
3. 4-5
4. 4-6
5. 4-12
6. 4-13
7. 4-14
8. 4-15
]]>2018-09-20T10:28:00+00:00Introduction to sorting algorithms in JavaScript
https://www.adamconrad.dev/blog/intro-to-sorting/
Tue, 18 Sep 2018 10:28:00 +0000https://www.adamconrad.dev/blog/intro-to-sorting/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/hashing-in-js/), check that out first.
Over the next several articles in this series, we are going to implement many of these basic sorting algorithms in JavaScript so you can compare and contrast the approaches and even find which are used in the JavaScript language today.
## Answers to the Daily Problem
This round our Daily Problem is back to coming directly from the book, practice problem 4-3:
> Take a sequence of `2n` real numbers as input. Design an `O(n log n)` algorithm that partitions the numbers into `n` pairs, with the property that the partition minimizes the maximum sum of a pair. For example, say we are given the numbers `(1,3,5,9)`. The possible partitions are `((1,3),(5,9))`, `((1,5),(3,9))`, and `((1,9),(3,5))`. The pair sums for these partitions are `(4,14)`, `(6,12)`, and `(10,8)`. Thus the third partition has 10 as its maximum sum, which is the minimum over the three partitions.
There's a lot of words here but we can reason about this another way: the pattern here is to minimize the largest sum is pairing the maximum number with the minimum number, and working inward.
We first need to sort (which takes `n log n`) the numbers, and then pair them off by taking the `maximum()` and the `minimum()` number from the set, plucking them out of the old set and placing them into a new set. Finally, we take the largest number from those pairs and that's the maximum sum that is the minimum across all possible permutations of the problem. In pseudocode, it looks something like this:
```js
function minimumMaxSum(set) {
let sortedSet = set.sort();
let pairs = [];
while (sortedSet) {
let maximum = sortedSet.maximum();
let minimum = sortedSet.minimum();
pairs.push(maximum + minimum);
sortedSet.remove(maximum);
sortedSet.remove(minimum);
}
return pairs.maximum();
}
```
## Applications of sorting
If you're wondering why sorting gets so much attention in interviews, it's because sorting is so important and so common. In fact, it is reported that [nearly a quarter of all computational time](https://www.johndcook.com/blog/2011/07/04/sorting/) is spent in just sorting data. Because it's so common, we have developed dozens of algorithms to tackle this problem, and is in fact one of the most studied problems in computer science. Here are just a few of the common use cases for sorting:
* **Searching:** If you want to find stuff, it's going to be way easier to find stuff if it's organized. Can you imagine searching for books in a library if everyone just stuffed their books in there arbitrarily?
* **Frequency / Uniqueness:** Remember things like _mean_, _median_, and _mode_ from math class? It's much easier to find the _mean_ and _median_ if you know they are roughly in the middle of your set. Similarly, the _mode_ will always be grouped together in the largest group of a set; all made easier by sorting.
* **Selection:** Like search, picking something out of a set is much easier when it's in a group of sorted items. Otherwise, the thing you want to pick is in a random group and may take longer to find.
## Approaches to sorting
When you think of sorting items, what comes to mind? Probably something like largest to smallest (or vice versa), and probably involving numbers. But lots of data can be sorted in lots of different ways, so before we dive into sorting algorithms it's important to remember that this is not the only way to sort. Here are a few more examples to consider when using sorting algorithms:
* **By key/record:** Just because we have the largest/smallest piece of data doesn't mean it is special. Sometimes it has to do with when/where that data was entered. When we enter data into a database with a unique incremental ID, we can sort by created-at date using the key of that record.
* **By alphabetical ordering:** Numbers increase and decrease, but so do letters based on their ordering in an alphabet. Even though there is no mathematical basis for A to be "less than" Z, there is a semantic and contextual basis.
* **By frequency:** Even if something isn't the newest or the largest, we may want to have the most popular/frequent data at the top to show to customers. Grouping by frequency and sorting on that can also be very important in picking out outliers.
Just like applications, the approaches to sorting are also numerous. What's important is not that this list is exhaustive, but that it is a _starting point for thinking about sorting as a fundamental activity of computer programming_. Whether or not you're a web programmer or an AI programmer, you _will_ have to sort some data while doing your job. Knowing how that works under the hood is advantageous, particularly if you're needing to optimize your programs for performance.
**One caveat: just because you should learn how this works doesn't mean you should implement your own sorting algorithms in a professional setting.**
This is important because while programming interviews may ask you to implement algorithms to commonly-solved problems, that doesn't mean you should _actually_ use them in the real world. What is important is that you know _how_ it works, not that you need this in your arsenal of tools to implement.
Next time, we'll look at one of these high performers that uses priority queues, a special variant on queues that we'll explore in-depth while we spell out our first sorting algorithm.
## Next Daily Problem
This one is pretty straightforward question, sort of something like you'd see out of a programming interview!
> Give an efficient algorithm to determine whether two sets (of size `m` and `n`) are disjoint. Analyze the complexity of your algorithm in terms of `m` and `n`. Be sure to consider the case where `m` is substantially smaller than `n`.
Think you have one in mind? Throw it up on a [gist](https://github.com/gist) and send it to me [on Twitter](https://twitter.com/theadamconrad)!
]]>2018-09-18T10:28:00+00:00Hashing in JavaScript
https://www.adamconrad.dev/blog/hashing-in-js/
Thu, 13 Sep 2018 10:28:00 +0000https://www.adamconrad.dev/blog/hashing-in-js/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/dictionaries-in-js/), check that out first.
## Answers to the Daily Problem
To solidify our understanding of dictionaries, the Daily Problem focuses on utilizing the primary functions of dictionary objects:
> You are given the task of reading in `n` numbers and then printing them out in sorted order. Suppose you have access to a balanced dictionary data structure, which supports each of the operations search, insert, delete, minimum, maximum, successor, and predecessor in `O(log n)` time.
So basically it's asking us to use some creative ju-jitsu with dictionaries in various situations.
> 1. Explain how you can use this dictionary to sort in `O(n log n)` time using only the following abstract operations: minimum, successor, insert, search.
Sorting in `n log n` is a pretty common worst-case for good sorting algorithms. It basically means we have to touch all `n` numbers and then divide and conquer the sorting part (which is `log n`). With these 4 functions available, we can sort with something like this:
The book does this slightly differently - first it adds all of the numbers to the list, then it sorts it using the `successor()` function, but I'm assuming we already have a list that we can copy into a new one by starting with the `minimum()` object and inserting it, _presorted_, into the new dictionary.
```js
function sort(n) {
let m = 0, min = minimum(n), newDict = new Dictionary();
while(m !== n) {
newDict.insert(min);
min = n.successor(min);
m++;
}
return newDict;
}
```
> 2. Explain how you can use this dictionary to sort in `O(n log n)` time using only the following abstract operations: minimum, insert, delete, search.
In this case, we keep stripping off the smallest item from the old list of `n` numbers. Each time we grab the minimum, we add it into the new dictionary, and then delete it from the old, unsorted dictionary.
This is actually a perfect example of what the book refers to as _recursive data structures_. When we remove one item from the dictionary, we are left with just a smaller dictionary. We can exploit this by recursively grabbing the smallest item and adding it to our new dictionary to create a sorted object.
```js
function sort(n) {
let min, newDict = new Dictionary();
while (n) {
min = minimum(n);
newDict.insert(min);
delete(n, min);
}
return newDict;
}
```
> 3. Explain how you can use this dictionary to sort in `O(n log n)` time using only the following abstract operations: insert and in-order traversal
```js
function sort(n) {
let newDict = new Dictionary();
for (let i = 0; i < n.length; i++) {
newDict.insert(i);
}
console.log(newDict.inorderTraverse());
return newDict;
}
```
This is the best I could think of, I actually don't really understand why the book thought this was a good solution. The idea is to copy all of the items into the new list, and then do the sorting by printing out the traversal of all items in-order. This is more congruent with the rest of their sorting algorithms, whereby they insert stuff first, and then handle the sorting part. This isn't really sorting so much as it is taking advantage of using a dictionary like a Binary Search Tree and then traversing by binary search.
## Arrays + Linked Lists = Hash Tables
While dictionaries can be implemented by using arrays or linked lists, we can get significant efficiency improvements by implementing dictionaries using arrays _and_ linked lists. The key to connect those two is to use _hashing_.
A _hashing function_ is just a way to translate something into a numeric representation. There is [no one singular good hash function](https://en.wikipedia.org/wiki/List_of_hash_functions) (e.g. [SHA-256](https://en.wikipedia.org/wiki/SHA-2), [BSD checksum](https://en.wikipedia.org/wiki/BSD_checksum), [Rabin fingerprint](https://en.wikipedia.org/wiki/Rabin_fingerprint)), but what matters is that **it's cheap to calculate and it's evenly distributed**. The number from that function can then be used as the index in an array.
```js
class HashTable {
ENGLISH_ALPHABET_LENGTH = 26;
TABLE_SIZE = 500;
hash(str) {
let sum = 0;
for (let c = 0, len = str.length; c < len; c++) {
sum += Math.pow(ENGLISH_ALPHABET_LENGTH, len-c+1) * char(c);
}
return sum % TABLE_SIZE;
}
}
```
So that's a sample hashing function to translate letters into really big numbers. Why are we making the number super big? So there's a high chance that the key (the word that represents something like an array index for our dictionary) we are translating will be a unique number.
Now we can expand on our `HashTable` to start implementing our dictionary functions!
```js
class HashTable {
// assume we have the hashing function and constants
// that we've already implemented
constructor() {
this.table = new Array(TABLE_SIZE);
}
search(item) {
let list = this.table[hash(item.key)];
return list ? list.search(item.value) : null;
}
insert(item) {
let idx = hash(item.key);
let list = this.table[idx];
if (list) {
list.insert(item.value);
} else {
list = new LinkedList();
list.insert(item.value);
this.table[idx] = list;
}
}
delete(item) {
let idx = hash(item.key);
let list = this.table[idx];
if (list) {
// delete may have to copy list
// in the event the linking is broken
list.delete(item.value);
}
}
}
```
Now you can see how the two data structures make a hash table a great implementation of a dictionary! We get the fast search benefits of an `Array`, with the faster inserts/deletes and safety of a `LinkedList` in the event that multiple strings happen to map to the same hashing function. That means **all of these operations are expected to run in constant `O(1)` time, and `O(n)` in the worst-case**, the difference being the event of a hash collision.
If we have no or few collisions, aided by an excellent hashing function, then the linked lists are never filled with more than item, so search and modification is instant. If we have a bad hash function or our luck is bad (_luck_ meaning that we happen to choose words whose hashing functions all result in the same number) then we'll have to operate on just one long linked list which is no different than the `O(n)` runtimes we calculated for lists.
## Practical applications for hash functions and hash tables
With a `HashTable` implementation in our repertoire, we can start to identify use cases when this data structure is good for us to utilize. In general, **hash tables and hashing functions help speed up algorithms that would take `O(n log n)` or `O(n^2)` and turn them into linear time (`O(n)`) algorithms**.
### Pattern matching strings with Rabin–Karp
_Strings_ are an array of characters with the order preserved so we can express things like words in a sentence. We can find patterns in our words (like substrings, palindromes, etc.) using searching algorithms that make use of a hashing function. One such algorithm is _Rabin–Karp_.
```js
function rabinKarp(str, pattern) {
let hashPattern = hash(pattern);
let patternLength = pattern.length;
for (let i = 0, len = str.length - patternLength; i++) {
let substring = str.substring(i, i+patternLength-1);
let hashString = hash(substring);
if (hashString === hashPattern) {
if (substring === pattern) {
return i;
}
}
}
return false;
}
```
The idea here is that we take advantage of the fact that our hash functions return unique but predictable numbers for strings of words. That means that **if we find the same big number inside of our string as the hash number of our pattern, it's probably a match**.
I say "probably" because as we know from our hash table implementation, there is a slight possibility of hash collisions that could generate false positives. Luckily, this is exceedingly rare as the length of our table is large and a prime number (primes of course can only be divided by 1 and themselves, making them harder to have collisions since other numbers can't divide into the same answer).
### Duplication
Since Rabin–Karp helps us detect patterns within a string, we can apply this algorithm to a variety of real-world scenarios that exploit the fact that duplicate hash values indicate that strings are the same:
1. **Unique web pages on the web.** If web pages are just one long document of words and links, then the entire text of the HTML document can be represented as one giant number (via hashing, of course). For crawlers like Google, rather than having to slowly crawl all of the words of a document to verify it is unique, it just has to see if the hash value is in its mega list of hashes. If it isn't we know we have a new web page to add to the crawler.
2. **Plaigarism.** Copying text can be easy to detect because the hashes should be the same, just like crawling web pages. The only problem is adding in a few filler words, some spaces, different formatting… with a few small changes you can create a unique hash and make it impossible to compare one document to the next. Instead, we can leverage pattern matching by scanning the document in chunks and seeing if the vast majority of sub-hashes are the same. This is obviously a downside because it requires significantly more space to store and compare those hashes, but this is still smaller than having to search against every single word betwee the two documents.
3. **Save when changed.** Just as we know that a hash function can numerically represent an entire HTML document, we can hash an entire file with a cryptographic hash to ensure that the document has not been modified by an unauthorized third party. You ever see messages from your laptop warning you that "this third-party application has not been signed?" This is exactly what this is for! Developers send over a hash with their saved changes to prove they were the last ones to touch their programs. They then send those hashes either to app stores or directly to their customers to let them know that their programs haven't been hacked by outside developers, ensuring that their programs can be safely run on other machines.
As you can see, hashing, whether with a table or as a simple function, is a very practical mechanism for dealing with human language data in the form of strings. Strings are the fundamental formal translation of words into arrays, and the hashes are the translators that turn these characters into numbers. This makes dealing with lots of string data very efficient when we hash them out into numeric data.
## The next Daily Problem (4-3)
This round our Daily Problem is back to coming directly from the book, with a preview into our next segment, sorting:
> Take a sequence of `2n` real numbers as input. Design an `O(n log n)` algorithm that partitions the numbers into `n` pairs, with the property that the partition minimizes the maximum sum of a pair. For example, say we are given the numbers `(1,3,5,9)`. The possible partitions are `((1,3),(5,9))`, `((1,5),(3,9))`, and `((1,9),(3,5))`. The pair sums for these partitions are `(4,14)`, `(6,12)`, and `(10,8)`. Thus the third partition has 10 as its maximum sum, which is the minimum over the three partitions.
]]>2018-09-13T10:28:00+00:00Dictionaries and Binary Search Trees in JavaScript
https://www.adamconrad.dev/blog/dictionaries-in-js/
Tue, 11 Sep 2018 10:28:00 +0000https://www.adamconrad.dev/blog/dictionaries-in-js/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/basic-data-structures-in-js/), check that out first.
## Answers to the Daily Problem
This problem basically asked us to combine what we learned from [Big O Notation](/blog/how-to-analyze-js-functions/) and apply that to one of the fundamental [data structures](/blog/basic-data-structures-in-js/) we saw in the last article.
> For each of the four types of linked lists (singly unsorted, singly sorted, doubly unsorted, doubly sorted), what is the asymptotic worst-case running time for each dynamic-set operation listed?
If you've been following along with this series, this should be pretty easy to figure out.
> a) Search(List, item)
Drawing on our playpen slide tube analogy, we know that we can only enter our `LinkedList` from either ends of the slide. The worst-case is that we enter the wrong end of the slide, and our target is all the way at the other end. So if we have to go through the whole list of `n` items, that means **our worst-case runtime for search is `O(n)` across all 4 kinds of lists**.
> b) Insert(List, item)
One of the benefits we mentioned in the previous article was that insertions and deletions were simpler because we can just tack on stuff to the ends of the list. Since we always have a pointer to the `head` of any kind of linked list, it only requires **constant-time access (`O(1)`) to insert an item onto any unsorted linked list**.
Why does sorting hurt us? Because once the list is sorted, _we can no longer tack on the item to the beginning of the list_. Now the list has some logical ordering behind it, we have to maintain that ordering, which could mean having to scan the entire list to find the exact right place to insert our new item, making **insertions a linear-time runtime (`O(n)`) for sorted linked lists.**
> c) Delete(List, item)
To limit confusion, let's right off the bat say **doubly-linked lists can implement this `O(1)` runtime because they have access to both the previous and next node**. Since you need both to connect them, and thus delete the active `item` by disconnecting it from the list, we're done here.
However, for singly-linked lists, this one is interesting because it depends on the algorithm you want to implement. If you want to delete the item _after_ `item`, this is a straightforward **worst-case for delete is `O(1)`** since you just connect `item.next` to the next of the next.
But if you want to delete the `item` provided, you need to find the previous node, and the only way to do that is to start from the beginning and work your way down until you see that the current node's `next` is equal to the `item` you specified. Only then can you connect `current` to the `next` of the next node, meaning **in the worst-case, deletes can take up to `O(n)` for singly-linked lists**.
> d) Successor(List, item)
This function just asks you to find the next logical item in the list. Logical in this case means that if you have a list that would be represented by an array as `[1,2,3]`, the _successor_ to `2` is `3`. Since logical ordering is based on sort order, this is entirely dependent on if the list is sorted or not, and nothing to do with being singly- or doubly-linked, since both of those implement a `next` property.
**For sorted lists, this is just an access to `next` and runs in `O(1)`, while unsorted lists may require viewing the whole list to finding the logical next item, and thus runs in the worst-case `O(n)`**.
> e) Predecessor(List, item)
Finally, we have something unique to tackle! This function is the first place we see fragmentation between singly- and doubly-linked lists. **For sorted doubly-linked lists, this, too, is `O(1)` because the previous node is a property inherent on the list object, and property access happens in constant time**. This doesn't apply for unsorted doubly-linked lists, because like our quandry with question D, it could take viewing the entire list to find our logical predecessor, and therefore **unsorted doubly-linked lists take `O(n)` to run the predecessor function.**
For singly-linked lists, this presents a problem. If you only ever have access to `next`, how can you possibly know what the `prev` item is? Another way to look at that is to flip the script: rather than looking at the predecessor as the previous node of the current item, we instead want to look for _the previous node's next node_. Yes, see, because we know that the previous node's next node is the same as the current node (`item`) that we're passing in! But to do this we have to start from the very beginning of the list. And if our current `item` is all the way at the end of the list, we'll have to search the entire list of `n` items until we find the predecessor, therefore **singly-linked lists can calculate the predecessor in `O(n)` time in the worst-case, regardless of whether or not they are sorted**.
> f) Minimum(List)
Here we just want to find the minimum item in the list. Whether that's the lowest ordered letter in a list of letters in the alphabet, or the lowest number in the list, we want to find the minimum value. Now it should start to make sense as to why there were 4 list types, one group centered around sorting. When the list is not sorted, you really have no idea where the minimum item could be, so **in the worst-case scenario you have to search the entire list to find the minimum item, which is `O(n)`**. However, if the list is sorted, you know that it is going to be at the very start of the list, which you already have access to in both singly- and doubly-linked lists, and therefore **the minimum value can be found in a sorted linked list in constant `O(1)` time**.
> g) Maximum(List)
Finding the maximum is identical to finding the minimum, but not exactly for the reasons you think (again ,think of the ends of a slide). The obvious part is that **unsorted lists require searching the entire list, so maximum is found in `O(n)` time.** The other obvious connection should be that **sorted doubly-linked lists have access to both the first and last item so the maximum can be found in `O(1)` constant runtime**. So why is **maximum access found in constant time (`O(1)`) for sorted singly-linked lists**? When we add and delete items from a sorted singly-linked lists, we can use that as an opportunity to update a property of the `last` node in the list. By paying that storage cost in those functions, it allows us to have instant access to the last node, which in the case of a sorted list, also happens to be the maximum value as well.
Wow! That's a big _list_ of problems to solve! **As a thought exercise, see if you can't extend this table to include runtimes for sorted and unsorted arrays as well.**
## Thinking about dictionaries
**Think of a data structure dictionary as the eponym of the physical book dictionary is a good analogy.** Now I'm going to give myself some credit here because I think this is pretty clever: I purposefully chose a fancy word like _eponym_ because if you don't know what that means, YOU'RE GOING TO LOOK IT UP IN A DICTIONARY.
And when you look up that word at dictionary.com or whatever your favorite word site is, how do you search for the word? _By its name_. **Dictionaries operate exactly the same way as dictionary books do because searching for content happens by looking up the actual name of the content itself.** Now you know how a `search` method would be implemented.
What about `insert` and `delete`? Well how is a word and its definition laid out in a dictionary that you read?
> **eponym** noun _ep·onym | \ ˈe-pə-ˌnim_ : one for whom or which something is or is believed to be named
The name of the word is the **key** that you've looked up, and the definition of the word is the **value**. So whenever you want to add or remove a word in a dictionary, you _see if the key is available, and if it is, you add/remove the definition_.
As we saw from the Daily Problem, there are some additional (optional) methods you could add to a dictionary as well, such as `min`/`max` for quickly moving to the beginning/end of the dictionary, and `successor`/`predecessor` for moving forward/backward between keys/words (not necessarily forward/backword in memory), very similar to the `next` method we implemented for a linked list. We went into these pretty in-depth at the beginning, so suffice it to say we've thoroughly covered the 7 major methods for implementing dictionaries.
```js
let dictionary = {};
// insert
dictionary["eponym"] = "one for whom or which something is or is believed to be named";
// search
dictionary["eponym"];
// => "one for whom or which something is or is believed to be named"
// delete
dictionary["eponym"] = undefined;
// no different than dictionary["apple"], both are undefined
```
## Improving dictionary methods with Binary Search Trees
**Binary Search Trees (BST)** are a clever data structure that implements all 7 dictionary methods and takes all of the best parts of the dictionary runtimes. The result is a data structure that efficiently supports all of these functions.
BSTs consist of a _root_ or start node and trickle down into two other _child_ nodes: the _left_ and the _right_ child. The _left_ node is always less (in value) than the current root node, and the _right_ node is always greater (in valud) than the current root node. We can also have an optional _parent_ node reference which tracks in the opposite direction. This organization is what allows us to quickly and easily find any value in the tree.
**The best way to think of BSTs is to imagine the [Lady Justice](https://en.wikipedia.org/wiki/Lady_Justice) holding her balance scale with a heavier right-handed side.**
Why the right side? Since the right node is the larger item, if you imagine the tree holding numeric values, and those values represented the mass of the object on the scale, you know that the higher-valued item will weigh more:
```
6
/ \
4 \
8
```
If that right there is her scale, then you can clearly see that 8 weighs more than 4, so the scale will tip to the right in favor of the item that weighs more. An elaborate example? Maybe. An example you will remember? _Definitely._
```js
class Node {
constructor(value) {
this.value = value;
this.left = null;
this.right = null;
this.parent = null; // optional
}
}
class BinarySearchTree {
constructor() {
this.root = null;
}
search(item) {
let curr = this.root;
while (item.value !== curr.value) {
if (item.value < curr.value) {
curr = curr.left;
}
else if (item.value > curr.value) {
curr = curr.right;
}
}
return curr;
}
insert(item) {
let curr = this.root;
if (!curr) {
this.root = item;
return true;
}
while(curr) {
if (item.value < curr.value) {
if (curr.left) {
curr = curr.left;
} else {
curr.left = item;
break;
}
}
else if (item.value > curr.value) {
if (curr.right) {
curr = curr.right;
} else {
curr.right = item;
break;
}
}
}
return true;
}
maximum() {
let max = this.root;
if (!max.value) return null;
while (max.right) {
max = max.right;
}
return max.value;
}
minimum() {
let min = this.root;
if (!min.value) return null;
while (min.left) {
min = min.left;
}
return min.value;
}
preorderTraverse(item) {
if (item.value) {
console.log(item.value);
traverse(item.left);
traverse(item.right);
}
}
inorderTraverse(item) {
if (item.value) {
traverse(item.left);
console.log(item.value);
traverse(item.right);
}
}
postorderTraverse(item) {
if (item.value) {
traverse(item.left);
traverse(item.right);
console.log(item.value);
}
}
traverseTree() {
inorderTraverse(this.root);
}
}
```
**Question: Given our practice with Big O Notation, can you figure out the runtime of the methods within this sample `BinarySearchTree` class?**
**Extra Credit: We never filled in the `delete(item)` method for BSTs. To be honest, they're kind of tricky. How do you think you would implement one?**
Would love to see your answers and sample gists [on Twitter](https://twitter.com/theadamconrad).
## Balanced BSTs and the next Daily Problem
While BSTs are very good data structures most of the time, they can still be improved. **_Balanced_ BSTs will automatically rebalance nodes whenever new items are inserted or deleted into the tree.** For example, if you keep adding the natural numbers (1,2,3,4,5…) to a BST, it will be effectively the same thing as a linked list, with the height of the unbalanced tree _`h`_ equal to the length of the items _`n`_:
```
1
\
2
\
3
\
4
\
5
```
But if you _balance_ the tree so that we try to maximize as many _left_ and _right_ nodes being used as possible, the end result of a rebalance would look like this:
```
3
/ \
2 4
/ \
1 5
```
This time, the height is halved at every level, and we now know that if we are successively halving something, it's happening in a logarithmic fashion. Therefore **the height of the balanced BST _`h`_ is equal to _`log n`_**.
To get some more practice dealing with this concept for balanced BSTs, the Daily Problem focuses on them:
> You are given the task of reading in `n` numbers and then printing them out in sorted order. Suppose you have access to a balanced dictionary data structure, which supports each of the operations search, insert, delete, minimum, maximum, successor, and predecessor in `O(log n)` time.
>
> 1. Explain how you can use this dictionary to sort in `O(n log n)` time using only the following abstract operations: minimum, successor, insert, search.
>
> 2. Explain how you can use this dictionary to sort in `O(n log n)` time using only the following abstract operations: minimum, insert, delete, search.
>
> 3. Explain how you can use this dictionary to sort in `O(n log n)` time using only the following abstract operations: insert and in-order traversal
Think you've got the answers? We'll find out in two days with the next article! And for more practice, check out these problems from the book:
1. 3-10
2. 3-11
]]>2018-09-11T10:28:00+00:00Basic data structures in JavaScript
https://www.adamconrad.dev/blog/basic-data-structures-in-js/
Thu, 06 Sep 2018 10:28:00 +0000https://www.adamconrad.dev/blog/basic-data-structures-in-js/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/). If you missed the [previous article](/blog/how-to-analyze-js-functions-with-big-o/), check that out first.
## Answers to the Daily Problem (2-20)
So this one was honestly a bit confusing to me. The question reads:
> Find two functions `f(n)` and `g(n)` that satisfy the following relationship. If no such `f` and `g` exist, write “None.”
The problems are odd because we don't really deal with Little O, which stands for "grows strictly smaller than." Little O is like a stricter version of Big O. This just isn't something you'd have to worry about in the professional world, either as a working developer or in an interview setting. It's also not a topic that is covered very much in an algorithms book. But let's give it a try anyway.
> (a) f(n) = o(g(n)) and f(n) != Θ(g(n))
So this is saying `g(n)` should grow at a smaller rate than `f(n)` should, _and_ should not be the average-case, meaning it can't satisfy both `O(g(n))` or `Ω(g(n))`. Basically, we want an f(n) that is not as bad as g(n), so a simple one would be **`f(n) = n + 1, g(n) = n^2 + 1`**.
> (b) f(n) = Θ(g(n)) and f(n) = o(g(n))
This one should be **none**, and the reason why is that `Θ` is all about being tightly-bound, whereas the little letters (Little O and Little Omega) are all about being loosely-bound. You can't have both, so the answer is none.
> (c) f(n) = Θ(g(n)) and f(n) != O(g(n))
This is also **none**, because the definition of `Θ(g(n))` assumes that you satisfy both the Big O and the Big Omega.
> (d) f(n) = Ω(g(n)) and f(n) != O(g(n))
This one should be fairly straightforward. We want something that satisfies Big Omega and not Big O, meaning we don't want a Big Theta set of functions. **`f(n) = n^2 + 1, g(n) = n + 1`** should satisfy this (essentially, the inverse of problem A).
## Data structures as blocks of memory
One way to group data structures is by how they allocate memory. I will admit, this was not an intuitive way for me to think about elementary data structures, but it's a good thing to keep in mind. There are two major forms of data structures by memory allocation:
### 1. Continuous
Continuous-memory data structures are those that are linked as one block of memory. So when a computer allocates memory in RAM for this structure, it will allocate one block so it can be accessed as one block. The computer won't have to use pointers to find chunks of addresses in order to make sense of all of the data for this one structure. There are lots of data structures of this type we'll cover later, but the only fundamental continuous data structure you need to be aware of is the **array**.
#### Arrays
Arrays are _the_ fundamental continuous data structure that everyone knows. In JavaScript, they have a `typeof === 'object'` but you can distinguish them with `[] instanceof(Array)`.
**Arrays are like [cubbies](https://www.google.com/search?q=cubbies&source=lnms&tbm=isch&sa=X&ved=0ahUKEwj8qP_r1O3dAhWwTt8KHViDD1wQ_AUIDygC) at school.** A cubby is one big piece of furniture with slots to store things like hats, coats, and backpacks.
Like arrays, they store stuff, and they are one object. And like in JavaScript, cubbies can store anything that fits, we don't have any strict types that dictate how our arrays can be allocated like in statically-typed languages.
##### Advantages
* **Constant-time access to elements.** If you provide the _index_, or numeric address inside of the array, we know exactly where the value of that item is without any sort of transformation. That makes array access extremely fast.
* **Space efficiency.** If you only have to take up one block of memory, that's more efficient than if you have to take up lots of discontinuous, fragmented chunks of memory.
* **Colocality.** Building on the previous benefit, because it's one block of memory, that also means that all of the values of the array are located next to each other, making access via traversal also very easy. This allows computers to make use of their caches since we can easily plop the whole array onto (and off) of the cache.
##### Disadvantages
* **Fixed size.** The space efficiency comes at a cost that we can't modify the size of the array later on. If we allocate for 5 items, that's it.
There are two ways to mitigate this: _buffer space_ and _dynamic arrays_. With buffer space, we can leave extra empty room for the array to grow. So let's say you allocate an array for 5 items, well we can buffer in 5 more for a total of 10 allowable items. The obvious downside here is that there is wasted space in anticipation for utilizing arrays for purposes we didn't intend.
The other option is to use dynamic arrays. Dynamic arrays will allocate double the space of the current size of a filled array and copy its contents over to that newly-allocated block. It will then free up the old space that was allocated for the original array. This is like a smart version of buffer space that only adds in the buffer when the array is filled.
It still suffers from the same downside of potentially wasted space, plus the additional operational complexity of allocation and deallocation (`O(n)`). This cost is still `O(n)` because the amount of reallocation work is the convergence of a geometric series (i.e. we double the size half as often because the amount of slots required to fill double with each expansion) which in the end works out to `O(n)`. It also can't guarantee that access will be in constant-time, since the boundary conditions for accessing elements that double the array will now take `O(n)` time.
```js
var myArray = [1,2,3];
```
### 2. Linked
Linked data structures, as you might guess, differ from continuous data structures because they are distinct blocks of memory _linked_ together to form what looks like one cohesive structure. The linking mechanism is called a _pointer_.
A pointer acts as a reference to an address in memory. So while your house is at 742 Evergreen Terrace in Springfield, it's the Evergreen Terrace part that serves as the pointer to your house. In programming, **we make heavy use pointers by passing the value of the pointer around in languages like C and C++.** In JavaScript, however, no such functionality exists.
Instead, **we pass objects by reference in JavaScript**. That means we make a copy of the object as a reference to the original object we want to find, rather than the address we want to use to find it in memory.
In this article we'll cover several of these kinds of fundamental linked data structures, including:
#### Linked lists
I have memories/nightmares of my very early college days doing phone screens for my internships with questions like:
> Tell me how to reverse a singly-linked list in place.
It's funny, looking back now it actually _is_ a good question to ask a sophomore in college because he has no real professional experience to lean on to reply with something snarky like:
> Why would I know how to do that? My list library already has the optimal algorithm.
Anyway back to the task at hand.
I like to **think of linked lists like [those colorful play slides](https://sc02.alicdn.com/kf/HTB1VYtyHpXXXXX3aFXXq6xXFXXXK/200823934/HTB1VYtyHpXXXXX3aFXXq6xXFXXXK.jpg) you see on playgrounds**. There's one way to enter: at the top (or the start of the list). You can't see what's inside the slide/list until you slide down/traverse the list. And each piece of the slide/list is a different color/value. And the only way to get from one section/item to the next is to slide down the slide/iterate to the next time.
And for doubly-linked lists, those are like the crazy kids who would get to the bottom of the slide and dare to climb back up inside to the top. Sure, they're kind of crazy and mildly stupid (the climbers, not the lists), but the reward for the extra work (of climbing or implementing the extra functionality) is increased flexibility and exploration. Now obviously there is more of an implementation cost to doubly-link a list, but this cost is negligible, so it's basically always worth implementing.
##### Advantages
* **Memory is dynamically allocated.** Each time we add a `LinkedList` item to the end of the list, we allocate some more memory. No more and no less than exactly what we need. This is advantageous to arrays because of that space efficiency.
* **Insertion and deletion is simpler.** Since list manipulation operates on the ends of the list, which we have direct access to, these operations are extremely simple, as in, constant-time (`O(1)`) simple .
* **Sorting/rearranging lists is simpler, too.** Since the chain between list items is just a pointer to the next item, we don't have to worry about physically changing the memory space to rearrange the list. All we have to change is who is pointing to who. This is particularly beneficial for really large objects because those objects do not have to move. Moving and finding really large blocks of memory is far more difficult and costly than moving pointers which are of a very tiny fixed size.
##### Disadvantages
* **More overall space.** You not only need space to store your value, but also to point to your other list items.
* **Worse colocality and caching performance.** Since there is no requirement that all of your `LinkedList` items have to be part of the same memory block, it's much more difficult to cache since you could theoretically have to search all of your RAM to find the memory spaces to throw the list into the cache.
* **Random access is slower.** You can't just pluck an item with an index. Remember, this is like a slide. The only way in or out is from the ends, there is a barrier around the middle items of the list. Only from the ends can you make your way to the section of the list that you desire.
```js
class Node {
constructor(value) {
this.value = value;
this.next = null;
}
}
class LinkedList {
constructor() {
this.head = null;
}
delete(item) {
let prev, curr = this.head;
while (curr.value !== value) {
prev = curr;
curr = curr.next;
}
prev.next = curr.next;
curr = null;
}
search(value) {
let curr = this.head;
while (curr.value !== value) {
curr = curr.next;
}
return curr;
}
insert(item) {
item.next = this.head;
this.head = item;
}
}
```
#### Stacks and queues
Once you've grasped linked lists, stacks and queues are pretty easy to understand. **Stacks and queues are just specialized linked lists.**
Technically, both of these can be implemented using a linked list _or_ an array. It is arguably easier to make a queue with a linked list and a stack as an array because of the way the insertion and deletion functions are implemented respectively for arrays and lists.
##### Stack
**A stack is exactly what you think it looks like.** A stack of pancakes. A stack of cards. A stack of whatever.
When you throw pancakes onto a plate, what's the pancake you see at the top? The _last_ one you put on the plate. When you start eating the pancakes, what's the first one you dig your fork into? _Also the last pancake_. This actually has a name: **Last-In-First-Out, or LIFO**.
In the jargon of computer science, really the only difference with a stack and regular old linked list is `insert` is now called `push`, and `delete` is now called `pop`. In fact, you may recognize that because [those are methods on Array.prototype](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array#Methods_2).
That's right, you could actually build a very simple `Stack` class by limiting the scope of an `Array`'s modification to just `push()` and `pop()`.
Why would you need one over an array? Stacks are an ideal data structure for handling depth-first search. Just like you dig deeper into that stack of pancakes, so should you think about using stacks with DFS. We'll get more into search algorithms in a later article.
##### Queue
**A queue is also an aptly-named structure for the thing you do when you wait in line at the deli or the DMV.**
What's that process like? You take a ticket, it has some number on it, and you wait until your number is called. When you are waiting in line, how do they process the orders? From the lowest ticket number to the highest, or in other words, the time people entered the queue. This, too, has a name: **First-In-First-Out, or FIFO**.
Again, this is just a specialized linked list with different names for `insert` and `delete`. Now they are called `enqueue` and `dequeue`. JavaScript has [these methods]([those are functions on Array.prototype](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array#Methods_2) as well, but they are called `push` and `unshift` (or `shift` and `pop`, depending on how you look at it).
No, that's not a typo, `push` is used twice on purpose. With queues, think of it like the tickets: when a number is called from the deli, we're taking the oldest ticket with the lowest number, that's at the _front_ of the queue, or the beginning of the list. When the list was empty, we added to the end of it. As more people are added to the list, we keep adding them to the end of the queue. But when the person behind the deli counter calls our number, where are we in line? _Right in front_, so we don't remove from the end of the list, _we remove from the beginning_.
Why would you need one over a linkedt list? Queues are an ideal data structure for handling breadth-first search. Just like people shuffle across the floor when waiting at the deli counter, so should you think about using queues with BFS. Again, we'll save search algorithms for later.
Okay, I think that's enough for now. Next article we'll cover dictionaries, but right this moment we'll move onto the Daily Problem.
## The next Daily Problem
This one is not in the book, so I'll just transcribe it here:
> For each of the four types of linked lists (singly unsorted, singly sorted, doubly unsorted, doubly sorted), what is the asymptotic worst-case running time for each dynamic-set operation listed?
>
> a) Search(List, item)
>
> b) Insert(List, item)
>
> c) Delete(List, item)
>
> d) Successor(List, item)
>
> e) Predecessor(List, item)
>
> f) Minimum(List)
>
> g) Maximum(List)
Pretty straightforward: for the functions that operate on a linked list, give the Big O time complexity. If you've followed along with this and the previous articles then this question should take very little time to answer. And for more practice, check out these problems from the book:
1. 3-2
2. 3-4
]]>2018-09-06T10:28:00+00:00How to analyze JavaScript functions with Big O
https://www.adamconrad.dev/blog/how-to-analyze-js-functions-with-big-o/
Tue, 04 Sep 2018 10:28:00 +0000https://www.adamconrad.dev/blog/how-to-analyze-js-functions-with-big-o/ For each of the following pairs of functions `f(n)` and `g(n)`, state whether `f(n) = O(g(n))`, `f(n) = Ω(g(n))`, `f(n) = Θ(g(n))`, or none of the above.
It's just a fancy way of saying, "we have a function `f(n)`, is its `g(n)` the best-case, worst-case, or average-case runtime for our function?" Let's take a look:
> (a) `f(n) = n^2 + 3n + 4, g(n)= 6n + 7`
I'm going to offer you the trick I used. As was mentioned previously, we can reduce the `g(n)` down to just the highest-powered variable without the constants. So that means I'm really just looking at whether `n^2 + 3n + 4` is compared to `O(n)`, `Ω(n)`, or `Θ(n)` for this to realistically apply.
For `f(n)`, the worst-case can't be any lower than `n^2`, so that's out. We also know that if worst-case can't work, then neither can the tight-bound theta (since theta requires that _both_ worst and best case can apply). Therefore, we do know the best-case can be no better than `n` given `g(n)` has an `n` in it, so we can safely say that **`f(n) = Ω(g(n))`** is the correct answer here.
> (b) `f(n) = n√n, g(n) = n^2 − n`
This one is a bit trickier, but we can use a neat trick for finding the powers for comparison.
`√n` is really the same thing as `n^0.5`, and when we multiply that with `n`, we're really just saying that `f(n) = n^1.5`. So now we're really just comparing `n^1.5` to `n^2`, which means that `f(n)` clearly has the lower order variable compared to `g(n)`, so **`f(n) = O(g(n))`** is the correct answer.
> (c) `f(n) = 2^n − n^2, g(n) = n^4 + n^2`
Applying the trick here again, we have an exponential function on the left, and a quadratic function on the right. Quadratic functions are lower on the [dominance relation](http://bigocheatsheet.com/), so we can that **`f(n) = Ω(g(n))`** is the correct answer here as well.
## Big O Analysis by example
Now that we've gone over some hypothetic mappings of calculating computational operations, let's actually apply this to real-world algorithms in JavaScript. The most applicable form of Big O Notation in a professional setting would be if you were asked to evaluate the time complexity of the algorithm you've just written on the whiteboard/typed on your computer.
In the following examples, we'll provide the JavaScript code for some algorithm, and then we'll do a Big O analysis line-by-line to show you how we can provide an `f(n) = O(g(n))`.
### Sorting items in an array with Selection Sort
Selection sort is an algorithm that sorts an array by _selecting_ the lowest remaining value and putting it at the end of the sorted portion (the sorted portion is allocated to the front of the array). It looks something like this:
```js
function selectionSort(arr) {
let min;
const arrLength = arr.length;
for (let i = 0; i < arrLength; i++) {
min = i;
for (let j = i + 1; j < arrLength; j++) {
if (arr[j] < arr[min]) min = j;
}
[arr[i], arr[min]] = [arr[min], arr[i]];
}
return arr;
}
```
If the last line before the brackets seems weird to you, this is how you can [swap variables using ES6 destructuring](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Destructuring_assignment#Swapping_variables).
This is also a great way to illustrate the appropriate uses of `let` and `const`. `let` makes sense for variables like `min`, `i`, and `j` because they are being modified over and over again. `const` makes sense for `arrLength` because the length of the array never needs to change, so we can keep it constant.
So how do we know how much this will cost in terms of Big O time complexity?
Remember the 3 rules from the previous article:
> 1. **Add 1 for simple operations**.
>
> 2. **Add `n` for looping over a set**.
>
> 3. **Whatever your calculations were for 1 and 2, multiply them if the algorithm is called again recursively**.
So let's go through this line-by-line, ignoring lines for whitespace and closing brackets because those are just syntax and are not evaluated:
The first line is the function declaration, and while that requires computation, we're evaluating what is _inside_ of the function, so we can skip this line.
The next two lines are variable declarations, which apply rule 1, and take a constant amount of time (this includes the `length` function on the array). So let's call our total `2` right now.
Now we've hit our first `for` loop. How many elements are we iterating over? From zero up to the length of the array, which we can mark as `n` since that's been the variable we've stuck with for all of our Big O calculations thus far. This brings our total up to `n + 2`.
The next line brings another variable assignment, bringing our total up to `n + 3`.
Now we've hit another `for` loop. But this time it's _inside_ the other `for` loop. How would that work?
Imagine you have 5 cards in your hand, each numbered 1 through 5. Take the first card and add it up against every other card, including itself. How many additions have you made? It should be 5. Now do that for the second card, adding up the other four cards plus itself. The number of additions? Still 6. If you keep repeating this, each card has interacted with each other card, for a total of 25 operations. 25 = 5*5, so if we generalize this for `n` cards, the total is `n*n = n^2`. Therefore, our nested `for` loop is _multiplying_, not adding onto our outer `for` loop.
Since our outer `for` loop thus far has a variable declaration and the loop itself, we're looking at a total of `(n-1)*(n+1) + 2`. Why `n-1`? Because we declare `j = i+1`, so the inner loop will only ever see, at most, 1 less than the entire array.
Next, we have a bunch of things going on all places onto one line. We have a conditional statement, some variable access in the array, and finally a variable assignment (if the condition holds).
This part gets a bit tricky. We know that all of these items add a constant amount of cost in terms of computation. The problem is that with a conditional assignment (like with our `if` statement) the chance of making the condition based on the sum of all of the remaining items in the array to evaluate. That sum is equal to `(n-1) + (n-2) + ... + 2 + 1`, which is approximately equal to `n(n-1) / 2`, bringing the total to `(n-1)*(n+1) + n(n-1)/2 + 2`.
Next, we have our JavaScript destructuring which is acting as a sort of swap function. Since these are just assignments of two variables, we know this is just a constant addition to our operations total within the outer `for` loop, bringing us to `(n-1)*(n+3) + n(n-1)/2 + 2`.
Finally, we return the array to display its newly-sorted self. Referencing a variable is just as costly as assigning one in terms of computation, and now that we're back into the base block of our function, we've completed our cost analysis with a grand total of `(n-1)*(n+3) + n(n-1)/2 + 3`. When we multiply everything out, we get `n^2 + 3n - n - 3 + .5n^2 - .5n`, which simplifies to `1.5n^2 + 1.5n - 3`.
So coming back to the actual asymptotic analysis for O, if we say that `f(n) = 1.5n^2 + 1.5n - 3` and we want to do all of the same simplification techniques when evaluating this to be equal to `O(g(n))`, we're left with the largest power which is just **`O(n^2)`**.
### Matrix multiplication
Did that feel like a lot of work? It should! Because with this next example, we're going to _drastically_ simplify our analysis, even though the algorithm is more complicated.
Matrix multiplication means taking two 2D arrays and multiplying them together to get the multiplied value of all of the cells. If one 2D array is of dimensions `x,y` and the other is `y,z` (they have to have at least 1 dimension in common else you can't actually perform multiplication on the two), we return an array of dimension `x,z` using the following algorithm:
```js
function multiplyMatrices(arr1, arr2) {
const x = arr1[0].length;
const y = arr1.length;
const z = arr2.length;
let matrix;
if (y !== arr2[0].length) {
throw new Error("Matrix cannot be multiplied");
}
// initialize matrix
for (let a = 0; a < x; a++) {
matrix[a] = [];
}
for (let i = 1; i <= x; i++) {
for (let j = 1; j <= z; j++) {
matrix[i][j] = 0;
for (let k = 1; k <= y; k++) {
matrix[i][j] += arr1[i][k] * arr2[k][j];
}
}
}
return matrix;
}
```
Much more complex, but with this trick, we can calculate this very quickly: **scan for nesting and multiply**.
The first few lines are constants, so worst-case now is `O(1)`.
Our first `for` loop of length `x`, now we know worst-case is `O(n)`.
We have a 2nd `for` loop where the multiplication starts, but since it's at the same nesting as the first one, we're still at a worst-case of `O(n)`.
The next `for` loop is inside of the other one, and as we saw from earlier, those get multiplied so now our worst-case is `O(n^2)`.
Finally, we have a 3rd `for` loop nested inside the other two, so multiplying this again we now have a **worst-case of `O(n^3)`**. We can skip the `return` statement as we know that's a constant, which we've already determined this far worse than constant time.
Even with all of the initialization, error handling, and matrix math, by simply scanning the function for nesting, we were able to determine very quickly that the time complexity of this function is cubic.
## A bit on why logarithms are special
Remember the example about continuously splitting a deck of cards in half until we reach our target card? This is an example of a logarithm function. **A logarithmic function is the inverse of an exponential function.** That means it tells us "how many times I need to double something to get to some number _n_". Conversely, it also tells us "how many times I need to cut something in half to get down 1."
If you're dealing with a binary search (i.e. choose your own adventure where you either go left or right until you reach a dead end), or you're climbing down the levels of a tree until you hit the leaves, then you know you're dealing with a Big O of `O(log n)`.
The base of the log (meaning base 2 if you have 2 options as in a binary search or base 3 if you have 3 options in a ternary search, and so on) doesn't matter. This is just like how the constant value of a quadratic function doesn't matter in big O analysis. All we care about is the fact that, in general, we're dealing with a logarithmic growth rate for our Big O analysis.
## The next Daily Problem (2-20)
You made it! Big O analysis isn't so bad if you apply some quick scanning tricks. To practice today's work, here's the next Daily Problem:
> Find two functions `f(n)` and `g(n)` that satisfy the following relationship. If no such `f` and `g` exist, write “None.”
>
> (a) f(n) = o(g(n)) and f(n) != Θ(g(n))
>
> (b) f(n) = Θ(g(n)) and f(n) = o(g(n))
>
> (c) f(n) = Θ(g(n)) and f(n) != O(g(n))
>
> (d) f(n) = Ω(g(n)) and f(n) != O(g(n))
## More practice
If you're looking for more [deliberate practice](https://www.redgreencode.com/deliberate-practice-for-software-developers/) with these articles, the _Algorithms_ class we're following includes homework sets and the [first one](http://www3.cs.stonybrook.edu/~skiena/373/hw/hw1.pdf)) just came out today.
Since I'm auditing this course (and so are you by definition if you're reading my articles instead of actually taking the course), I'll include the homework problems that are covered thus far. Once the assignment answer keys are out, only then will I spend any (if at all) time on my answers to the homework problems.
The problems from the homework set that are covered from the first three articles include:
1. 1-17
2. 1-19
3. 1-20
4. 1-22
5. 2-7
6. 2-8
7. 2-19
8. 2-21
9. 2-23
10. 2-24
Happy practicing!
]]>2018-09-04T10:28:00+00:00What is Big O Notation?
https://www.adamconrad.dev/blog/big-o-notation/
Thu, 30 Aug 2018 10:28:00 +0000https://www.adamconrad.dev/blog/big-o-notation/ The _knapsack problem_ is as follows: given a set of integers `S = {s1, s2, . . . , sn}`, and a target number `T`, find a subset of `S` which adds up exactly to `T`. For example, there exists a subset within `S = {1,2,5,9,10}` that adds up to `T = 22` but not `T = 23`.
> Find counterexamples to each of the following algorithms for the knapsack problem.
> That is, give an `S` and `T` where the algorithm does not find a solution which leaves the knapsack completely full, even though a full-knapsack solution exists.
Here's the actual problem, with the answers (and explanations) inlined:
> (a) Put the elements of `S` in the knapsack in left to right order if they fit, i.e. the first-fit algorithm.
**`T = 23; S = {22,21,2}`**. Going from first to last, we add `22`, but at this point there are no other elements that will add up to `23` (even though there does exist a valid solution of `21` and `2`). Our counterexample succeeds!
> (b) Put the elements of `S` in the knapsack from smallest to largest, i.e. the best-fit algorithm.
**`T = 5; S = {1,2,3,5}`**. Going from smallest to largest, we add `1` and then `2`. At this point adding `3` or `5` will put us over `5`, even though obviously choosing just the number `5` would satisfy `T`. Countered again!
> (c) Put the elements of `S` in the knapsack from largest to smallest.
**`T = 23; S = {22,21,2}`**. Same answer as A, and it works here too since they both happen to be ordered from largest to smallest values within `S`.
If any of that doesn't make sense, be sure to [let me know on Twitter](https://twitter.com/theadamconrad).
## Proving an algorithm's efficiency
As we mentioned in the last article, the key to proving efficiency involves **Big O Notation**. What is that exactly? The _O_ specifically refers to the letter commonly used for the name of the function that analyzes an algorithm's _worst case_ (or _upper bound_) on the relative amount of time the algorithm will take to complete (there's also _Ω_ which refers to the _best case_ / _upper bound_ and _Θ_ which refers to the _average case_ / _tight bound_, but in an interview you're only ever going to be quizzed on the worst case, if at all).
I say relative because our _O_ function doesn't return a number such as the number of seconds to complete our algorithm. Instead, it's _relative_ to the number of operations within our algorithm.
## How do we count the number of operations in an algorithm?
We can count operations in an algorithm with 3 simple rules (based on how computer RAM evaluates the cost of computing power):
1. **Add 1 for simple operations**. Adding two numbers (`+`), branching conditionally (`if`), and variable assignment (`var x = 1`).
2. **Add `n` for looping over a set**. If you see `for` or `while` in an algorithm, it's over some `n` number of items, so in the worst case, you need to consider the posssibility that you have to iterate over all `n` items.
3. **Whatever your calculations were for 1 and 2, multiply them if the algorithm is called again recursively**. For example, to calculate the fibonacci sequence on `n` items, each recursive call of fibonacci requires finding the previous value of the fibonacci sequence. If each call requires twice the number of calls, that's exponential (`2^n` specifically) number of calls to fibonacci!
But here's the thing, calculating the _precise_ number of operations for an algorithm is unnecessarily cumbersome. Therefore, **when we refer to _Big O Notation_, we're calculating the time complexity of an algorithm with the multiplicative constants ignored and the lower factors ignored**. So if you found out your algorithm ran in precisely `2n^3 + 4n^2 + 14763n + 1209489` operations, our Big O analysis can reduce this function all the way down to `O(n^3)`.
## Patterns for identifying algorithmic efficiency
So now that we know how to evaluate an algorithm's Big O worst case performance, what do we make of it's efficiency? Let's look at each of the major patterns you'll see for your Big O analysis:
### Constant - O(1)
These are the fastest algorithms because they are comprised of simple operations. Your classic addition function, `function add(x,y) { return x + y }` runs in constant time, it's just the `+` operation! In the real world, these can run on any machine, slow or fast, with essentially any kind of input (provided it actually works), as these will always be able to run efficiently.
### Logarithmic - O(log n)
Imagine you have a deck of cards. Cut the deck in half. Now cut that half of a deck in half. Now cut that portion in half. Now cut _that_ portion… okay you get the idea. If you plotted that on a graph, it would look just like the [`log(n)`](https://en.wikipedia.org/wiki/Logarithm).
If you have an algorithm that is dividing the problem space by some number (like by 2 if you're doing a _binary_ search on a tree), you can bet that algorithm runs in logarithmic time. Obviously this function is now dependent on `n` so it's going to be slower than constant-time functions, but these are so fast on modern machines that you can basically feed it any `n` number of items and it will run just fine.
### Linear - O(n)
Loop over each item in an integer array and print out the number. How many items do you have to print? For an array of 3 numbers, you have to count 3. For 5, it's 5. For `n` it's, well, `n`! This is why we say loops like `for` and `while` evaluate in `O(n)` since we have to touch, at worst, all `n` items.
These algorithms still run pretty fast. Given that CPUs can now run millions of operations per second, it's safe to say a `n` of billions of items is still pretty fast in the real world.
### Superlinear - O(n log n)
How would you get an algorithm like this? Again, imagine a deck of cards. I want you to find a card in the deck quickly. Even worse: I shuffled the deck. But you can find it pretty quickly if you follow these steps: Cut the deck in half until you just have pairs of cards. Sort the cards for each of the pairs (pretty easy to compare if one card is smaller than the other, right?). Now that all of the pairs are set, combine two pairs and repeat the process. Eventually you're going to have the whole deck sorted to easily find your card.
Continuously cutting a deck in half sounds very much like our example where the answer was `O(log n)`.
Looking at the whole deck? Sounds like `O(n)`.
And given that we have to look at all of the cards for each _layer_ that we have to construct, that's `n` number of times we have to make `log n` partitions. Hence, `n log n`! Again, since this is just slightly larger than a linear algorithm, we can run an algorithm like this very quickly on even billions of items.
### Quadratic - O(n^2)
You've got 5 numbers. You need to multiple each number by every number (including itself). So the first number is multiplied 5 times, the second number is multiplied 5 times, the third is too, and the fourth, and the fifth. 5 numbers multiplied 5 times is 25, which is the same is 5^2. So if we generalize this for `n` numbers to multiply by every other number, you've got your `O(n^2)` algorithm.
This is where time needs to start being considered. For tens of thousands of items, we're still moving pretty quickly with generally under a billion operations. Once you get above a million items for this kind of algorithm, we're looking at _trillions_ of total operations to be run. Even your MacBook Pro can't count all of Apple's money in an instant; it's a pretty big number!
### Exponential - O(2^n)
Remember the fibonacci sequence from earlier? A classic example of an exponential algorithm (hint: we can do _much_ better but the naive implementation is exponential). When you have to enumerate all subsets of `n` items, you're dealing with something exponentially painful. And we're talking _real_ painful. With quadratic algorithms we could work with a million numbers. But exponential? Just _40_ items gets us back up to over a trillion. _Oof_.
### Factorial - O(n!)
Arrange a deck of cards from smallest to largest. That's one configuration. Now arrange it from largest to smallest. That's another configuration. Now shuffle the deck. Yet another configuration. Now organize that deck where it's from smallest to largest but order each card where hearts come first, diamonds come second, clubs come third, and spades come last in a given number. These configurations are called _permutations_ of the deck. And in fact finding the number of permutations for _n_ number of items takes `O(n!)` amount of time. So to find all permutations of a standard 52 deck of cards is `52!`.
Do you just how _massive_ `52!` is?
There are more permutations of a deck of cards than there are [_atoms_ on this earth](https://gizmodo.com/there-are-more-ways-to-arrange-a-deck-of-cards-than-ato-1553612843).
You could shuffle a deck of cards every second _since the universe has started_ and still not see every permutation of a deck of cards.
In other words, don't write factorial algorithms. Beyond basically 20 items, it's not only painful, it's essentially _impossible_. Modern computers simply can't handle that number of operations. So if you see your algorithm is pemutating over all of your items, and you have more than a handful of items, you should probably look for another solution to your problem.
## Wrapping up with the next Daily Problem
So now that you understand about Big O Notation and the kinds of bounds you should be striving for, let's put that to the test:
> For each of the following pairs of functions `f(n)` and `g(n)`, state whether `f(n) = O(g(n))`, `f(n) = Ω(g(n))`, `f(n) = Θ(g(n))`, or none of the above.
>
> (a) `f(n) = n^2 + 3n + 4, g(n)= 6n + 7`
>
> (b) `f(n) = n√n, g(n) = n^2 − n`
>
> (c) `f(n) = 2^n − n^2, g(n) = n^4 + n^2`
Think you've got your answers? Good luck, and we'll see you next week!
]]>2018-08-30T10:28:00+00:00How to design an algorithm
https://www.adamconrad.dev/blog/how-to-design-an-algorithm/
Tue, 28 Aug 2018 10:28:00 +0000https://www.adamconrad.dev/blog/how-to-design-an-algorithm/ The _knapsack problem_ is as follows: given a set of integers `S = {s1, s2, . . . , sn}`, and a target number `T`, find a subset of `S` which adds up exactly to `T`. For example, there exists a subset within `S = {1,2,5,9,10}` that adds up to `T = 22` but not `T = 23`.
>
> Find counterexamples to each of the following algorithms for the knapsack problem.
>
> That is, give an `S` and `T` where the algorithm does not find a solution which leaves the knapsack completely full, even though a full-knapsack solution exists.
>
> (a) Put the elements of `S` in the knapsack in left to right order if they fit, i.e. the first-fit algorithm.
>
> (b) Put the elements of `S` in the knapsack from smallest to largest, i.e. the best-fit algorithm.
>
> (c) Put the elements of `S` in the knapsack from largest to smallest.
My plan is to solve the problem before the lecture, throw it up in the next article in the series to start off the article, see if it lines up with the answer given in the lecture, and then compare and contrast the different approaches as well as discuss any flaws or issues that arose along the way.
## See you next time!
Wow, this is probably one of the longest posts I've ever written. I hope this post has convinced you to finally learn data structures and algorithms once and for all. Even if you aren't dedicated to taking the entire course, I hope this stands to be a reference guide if there are ever parts or pieces of this that you might want to reference or check when working with these things in JavaScript.
]]>2018-08-28T10:28:00+00:00Why hiring is broken and how I'm dealing with it
https://www.adamconrad.dev/blog/why-hiring-is-broken-and-how-im-dealing-with-it/
Tue, 28 Aug 2018 10:28:00 +0000https://www.adamconrad.dev/blog/why-hiring-is-broken-and-how-im-dealing-with-it/2018-08-28T10:28:00+00:00Mastering the reduce function
https://www.adamconrad.dev/blog/mastering-the-reduce-function/
Wed, 18 Jul 2018 01:27:00 +0000https://www.adamconrad.dev/blog/mastering-the-reduce-function/ sum + n);
```
This is a perfectly valid example, but you might be wondering: How do you reduce an array of strings? _Why_ would you reduce an array of strings? What are some real-world examples of using reduce to help you in your day-to-day work? We're going to attempt to answer those in this article so you can master the reduce function and finally feel comfortable using it.
## Definition and breakdown
Reduce's job is to take a collection of items and _reduce_ them down to one singular value. The [type signature](https://hackernoon.com/function-type-signatures-in-javascript-5c698c1e9801) of this function looks like this:
```
reduce: (callback: function, initialValue?: any) => any
callback: (accumulator: any, currentValue: any, currentIndex?: number, array?: array) => any
```
We are using [TypeScript's function type annotation syntax](http://2ality.com/2018/04/type-notation-typescript.html#function-types) for this example. In short, it just helps decode the inputs and the outputs. So our function takes in two arguments we've called **`callback`** and **`initialValue`**. What are those arguments?
### `callback` argument and its type signature
**`callback`** is a `function` that takes four arguments itself: **`accumulator`**, **`currentValue`**, **`currentIndex`**, and **`array`**.
**`accumulator`** is of the type of thing you're trying to make your `outputValue`. So if you're reducing an array of `number`s through addition, the `accumulator` is a `number`. It is _accumulating_, or growing, in size **not by the length of the array but by the "growth" of the eventual output value**. In fact, there is an inverted relationship by how the accumulator grows against how the array is reduced. This object is definitely the most confusing, so if this still doesn't make sense, we'll break this down in the later examples.
**`currentValue`** is an item in the array you're trying to reduce. As you iterate over the array, your `currentValue` changes as you move from the first to the last element in that array.
**`currentIndex`** is an _optional_ `number` argument that coincides with the `currentValue` by expressing that value's index. So as you are iterating over your initial array, the `currentIndex` will grow from `0` up to the `n-1` number of elements in your array.
**`array`** is an _optional_ `Array` argument that is the entire input array you provided to the `reduce` function. This might seem like an odd thing to pass in but the key here is that the input array is **not modified**. Functional methods obey _immutability_, meaning they do not modify the array object they are provided. Therefore, regardless of what you do in your `reduce` method, you always have access to your input array because it is never changed throughout the execution of `reduce`.
### `initialValue` argument
**`initialValue`** is an _optional_ argument that provides the initial value for `accumulator` to begin reducing. For example, if you wanted to reduce `[1,2,3]` into the summation of those values, you'd want the `initialValue` to be set to `0` to ensure you are adding against the same type (a `number`) and that this value does not pollute the values that are being manipulated by the array. If you do not set this argument, the `accumulator` will be set to the first value of the array. In the above example, that would be `1`.
### `outputValue` return object
**`outputValue`** is the final form of the `accumulator` object once the entire `array` has been iterated. Unlike most array functional methods, which return arrays, this will return the type that you have built throughout the iteration. This is _not necessarily_ the same type as the values within your input array, as we will see below.
## Examples
Now that we know what goes in and out of the `reduce` function, let's dive into a few examples, step-by-step, in increasing difficulty.
### Easy - sum elements of an array
As we mentioned earlier, the ever-present example of reducing arrays is to sum a list of numbers through addition. It can be succinctly expressed like this:
```js
[1, 2, 3].reduce((sum, n) => sum + n);
```
The easiest way to see how the iteration is working is to throw log statements at each step for our two input arguments:
```js
[1,2,3].reduce((a, c) => {
console.log(`accumulator: ${a}`);
console.log(`currentValue: ${c}`);
return a + c;
});
// FIRST ITERATION
// accumulator: 1
// currentValue: 2
// SECOND ITERATION
// accumulator: 3
// currentValue: 3
// FINAL ITERATION (AS RETURN VALUE)
// 6
```
On the first iteration, we grab the first element in our array, `1`, and set that as our `accumulator` value. We then add our `currentValue`, which is the second element in the array, `2`, onto our `accumulator`, because that is the logic we have decided on for our `callback` function. 1+2 equals 3, so `3` therefore becomes our new `accumulator` value in our second iteration. Since no `initialValue` is set, our second iteration is actually operating on the third and final element, `3`, which is added on to our new `accumulator` value of `3`. 3+3 equals 6, and with no elements left to iterate on in our array, we return the final `accumulator` value of `6`.
Easy! Now, what happens if we operate on another kind of type, like a `string`?
### Medium - simple string builder
One of the most common things I see in JavaScript code involves building strings like this:
```js
const myBuddies = ['James', 'Darien', 'Eric'];
let greeting = 'Greetings friends! ';
for (let i = 0, l = myBuddies.length; i < l; i++) {
greeting += myBuddies[i] + ' is my buddy. ';
}
greeting += 'These are all of my buddies!';
return greeting;
```
You start with an array and an empty string. You build the string up from the array elements, maybe adding in some additional information, and then you return the new string, fully formed and populated. Now look what happens when I rearrange and rename some variables, do you see the connection?
```js
const array = [some, array, values, ...rest];
const initialValue = 'Greetings friends! ';
let accumulator = typeof initialValue === "undefined" ?
initialValue :
array[0];
const manipulation = (a, b) => a.concat(b);
let i = (typeof initialValue === "undefined" ? 0 : 1);
let l = arr.length;
for (i, l; i < l: i++) {
accumulator = manipulation(accumulator, array[i]);
}
return accumulator;
```
This iterative pattern _is the `reduce` function_. The `callback` within `reduce` is simply some function of `manipulation` of data, iterated across an `array` and saved into some built value called an `accumulator`! The difference is, our last example is 8 lines of code, and to make this in `reduce`, we can express it in a very clean and terse way, with zero local variables needed:
```js
['James', 'Darien', 'Eric']
.reduce((paragraph, name) => {
return `${paragraph}${name} is my buddy. `;
}, 'Greetings friends! ')
.concat('These are all of my buddies!');
```
This is pretty neat. We were able to remove all local variables and reduce the line count by more than half. We are also able to chain functions together to continue to tack on unique statements at the end of our string builder.
While this instance is a bit more involved and doesn't simply manipulate numbers, at the end of the day, this is still a form of summation, which may seem a bit too simplistic and similar to the first example. Lucky for you, we've got one last example that should be a bit more advanced and is not simply a form of addition.
### Hard - flatten array to dictionary
Our last example is not going to simply be some form of summation. Instead, we're going to build a dictionary (also known as a hash table, which is simply an `Object` in JavaScript) from an array of arrays.
We're going to make the first value in our sub-array be the keys, and the rest of our values in our sub-array be the values. If there are hash collisions (meaning multiple sub-arrays with the same value in the first element), we simply add the remaining sub-array items onto the existing key/value pairing.
How are we possibly going to do this with `reduce`?
We know our initial value is `{}`. We aren't doing a summation (or concatenation if we're dealing with strings), but we are still trying to manipulate our output value, which we know to be an `Object` as a hash table. How can we add things onto a hash table in JavaScript? Square bracket notation! That should be enough to get us what we want:
```js
const arrayOfArrays = [
['a', 'ace', 'apple', 'axe'],
['b', 'baseball', 'boy', 'bye'],
['c', 'cat', 'cold'],
['a', 'azure'],
['a', 'azure']
];
const dictionary = arrayOfArrays.reduce((dict, page) => {
const [letter, ...words] = page;
dict[letter] = dict.hasOwnProperty(letter) ?
dict[letter].concat(
words.filter(word => !dict[letter].includes(word))
) :
words;
return dict;
}, {});
dictionary['a'][0];
// 'ace'
dictionary['a'][dictionary['a'].length - 1];
// 'azure'
dictionary['a'].filter(word => word === 'azure').length
// 1
```
Lots to unpack here. To start, we created our multi-dimensional array called `arrayOfArrays`. Each row in the array starts with the letter of our dictionary. That letter will represent the various pages in our dictionary. You could do this without declaring the local variable, but for the sake of readability within this blog, I chose to separate them.
Next, we construct our actual dictionary. We start with our initial `Object`, a simple empty `{}` hash. Our `callback` represents two things, the `dict` dictionary object that we are constructing, and the `page`, which represents a row in our `arrayOfArrays`. That row has two kinds of strings: the letter representing what class of words we're on, and those words listed on that page.
In languages like Haskell, lists come with functions called `head` and `tail` (also known as `car` and `cdr` in older languages like Lisp). In JavaScript, these don't come out of the box, but thankfully in ES6 we can quickly grab these through the destructuring pattern `[first, ...rest] = array`. We do exactly this to grab our `letter` and corresponding `words`.
Next, we have to build our dictionary. There are two major considerations here: new pages and existing pages.
If we have a new page, we are dealing with the simpler case. All we have to do is populate the page (whose key is `letter`) with our `words`.
If we are dealing with an existing page, we want to add onto that existing page using `concat`. But we can't simply concatenate our list of `words`. What if we've already added a word to our dictionary? That's where our functional cousin `filter` comes in to filter out words that are already included on this page. The filter will return only novel `words`, removing duplicates along the way.
Finally, we test out our new `dictionary` variable to prove our dictionary is complete, can handle adding words late to our prepopulated keys, and will gracefully handle duplicates.
## Final thoughts
Congratulations! You now know how to reduce an array down to a composite value. You've not only mastered the reduce function in JavaScript but in every other language.
As we mentioned in the beginning, the concepts behind `reduce` are the same across languages like Clojure and Haskell, but with different names like `fold` or `inject`. You can also now reason about variants on `reduce`, such as `foldr` (i.e. fold right, which is the same as `reduce`), or `foldl` (i.e. fold left, similar to `reduce` except the iteration happens from the back to the front of the array).
If you're still confused, I've failed you. But fear not; find me on Twitter and I'd be happy to provide even more examples. I will not rest until everyone reading this knows the power of `reduce`!
]]>2018-07-18T01:27:00+00:00How to utilize VR on the web with A-Frame
https://www.adamconrad.dev/blog/how-to-utilize-vr-on-the-web-with-a-frame/
Fri, 29 Jun 2018 09:25:00 +0000https://www.adamconrad.dev/blog/how-to-utilize-vr-on-the-web-with-a-frame/ tag of a standard HTML file and you’re ready to get started! The entire framework is contained within this minified file, just like how you would program a web app with jQuery or React.
If you’re wondering why you insert this script in the top of the file, it’s because the VR environment cannot be loaded without the framework. With most JavaScript source files, it’s beneficial to load the DOM first to prevent the scripts from blocking the content you want to show. In this case, the framework creates its own VR DOM and must be loaded to make use of any of the tags you need to create a VR application on the web.
## Hello (Virtual) World
![](https://cdn-images-1.medium.com/max/2000/0*XKMbJv5AZpGWusxi)
So how do we work with A-Frame? It’s nearly as easy to build an A-Frame application as it is to install it:
Yup, that’s really all there is to it. Just insert that into the tag of that same HTML file and you have a WebVR application! You can view our extremely simple demo app [here](https://adamconrad.dev/aframe-demo.html) with a mobile phone using Google Chrome. Impressed yet? Here’s how it works:
The is like the tag or for the WebVR world. It is the container that isolates all of the 3D components and acts as a signal to the device that it needs to render this webpage in VR mode. From there, you can insert a whole set of components called primitives.
For this app, we inserted a simple [primitive](https://aframe.io/docs/0.8.0/primitives/a-text.html). We add the properties of our primitive just like our standard HTML tags, using inline attributes. As you might guess, the value attribute tells A-Frame what text to render into the primitive.
Since we are operating in a 3D world, we need to project this text onto something. Now that we are no longer dealing with a static surface like the tag, we need to tell A-Frame how we want to interact with this text primitive. The geometry provides a way to interact with text with something like a cursor by stating how we are projecting the text. In this case, we want to project our text onto another primitive component, the , which is simply referred to as a subset of primitives using :plane.
So how do all of these primitives work together? That’s where the Entity-Component-System architecture (ECS) comes in.
## A Brief Intro on the Entity-Component-System Architecture
![](https://cdn-images-1.medium.com/max/2000/0*K7UyFSMafpQI5FFj)
ECS is the architecture of choice for designing games and 3D environments. This architecture is heavily utilized in Unity, the popular 3D engine gaming framework. A-Frame is an ECS framework built on top of three.js, with specific support for VR devices. So let’s break down each of these words to understand the architecture:
**Entity** refers to house component objects in a scene. These are basically
s that are named based on the kind of component you are using. For example, is the entity that houses our “Hello world!” text. All A-Frame primitives are some form of entities that combine a subset of components.
**Component **is the reusable chunk of data that gets added to entities to bring them to life. So if is the HTML entity we add to our site, the value and geometry components are like the CSS to turn that entity into something meaningful.
**System** is the management system that ties components together across all of the entities they belong to within a scene. So if we again refer to the geometry component, the system for that component is what allows geometry to be shared by , , and primitives, controlling the shapes of the objects and how those shapes appear in space in relation to one another.
Understanding ECS is vital to conceptualizing the A-Frame API because A-Frame is essentially a giant collection of starter entities and components organized by various systems. You can then use these systems to build more complex and engaging components or entities to make your own environments richer and more interactive.
## Next Steps
We’re barely scratching the surface on what VR can do today on the web. Even though VR itself is a brand new and growing platform, truly immersive web experiences are already possible with browsers like [Supermedium](https://supermedium.com/). These VR-first browsers provide a platform to showcase the power of frameworks like A-Frame.
If this article piqued your interest in really diving into WebVR and A-Frame, A-Frame provides a guided tutorial series called [A-Frame School](https://aframe.io/aframe-school/#/1) to walk you through building your first WebVR application.
Once you think you’ve got something unique to share, there are a few places to promote your work. Supermedium offers a [directory](https://webvr.directory/) for compelling WebVR projects, as well as sites such as [WITHIN](https://vr.with.in/).
VR right now is very similar to the early days of the web: fun, expressive, and weird. There’s no need to try to aim for professional or production quality. Rather, we’re in the nascent stage where expression and creativity thrive, pushing people to test the limits of what is capable while the technology is still being fleshed out. There’s no better time to get ahead of the curve with the most bleeding-edge user interface than in AR/VR on the web with A-Frame.
*Originally published at [softwareengineeringdaily.com](https://softwareengineeringdaily.com/2018/06/29/how-to-get-started-in-vr-with-a-frame/) on June 29, 2018.*
]]>
2018-06-29T09:25:00+00:00What you need to know about WCAG 2.1
https://www.adamconrad.dev/blog/what-you-need-to-know-about-wcag-21/
Wed, 27 Jun 2018 14:11:00 +0000https://www.adamconrad.dev/blog/what-you-need-to-know-about-wcag-21/2.1.4 Character Key Shortcuts (A)
* 2.5.1 Pointer Gestures (A)
* 2.5.2 Pointer Cancellation (A)
* 2.5.3 Label in Name (A)
* 2.5.4 Motion Actuation (A)
* 1.3.4 Orientation (AA)
* 1.3.5 Identify Input Purpose (AA)
* 1.4.10 Reflow (AA)
* 1.4.11 Non-text Contrast (AA)
* 1.4.12 Text Spacing (AA)
* 1.4.13 Content on Hover or Focus (AA)
* 4.1.3 Status Messages (AA)
* 1.3.6 Identify Purpose (AAA)
* 2.2.6 Timeouts (AAA)
* 2.3.3 Animation from Interactions (AAA)
* 2.5.5 Target Size (AAA)
* 2.5.6 Concurrent Input Mechanisms (AAA)
## What do the different conformance levels (A, AA, AAA) mean?
As seen in the WCAG 2.1 document, there are a large number of recommendations for accessibility. It would be unrealistic to expect site owners to adhere to them all, so **each success criterion is given a priority for how important it is to satisfy**.
![Accessibility is important](/assets/images/blur-codes-coding-577585.jpg)
Level A criteria are the most important, followed by AA, and then finally AAA criteria. Each level builds on the other, so it is necessary to satisfy all of the Level A requirements first before trying to satisfy AA or AAA requirements. This is because the W3C [requires full conformance at each level to be recognized as compliant](https://www.w3.org/TR/UNDERSTANDING-WCAG20/conformance.html#uc-conf-req1-head).
> For Level AA conformance, the Web page satisfies **all the Level A and Level AA Success Criteria**, or a Level AA conforming alternate version is provided.
Each section starts with Level A requirements, and builds up to Level AAA, with each requirement adding onto the previous one. For example, Section 3.3 is on input assistance. 3.3.1 and 3.3.2 are Level A requirements. 3.3.3 builds on 3.3.1, which is Level AA. The highest level, 3.3.5 and 3.3.6, base success around much more involved criteria, and thus achieve a rating of Level AAA.
## What conformance levels do I need to pass?
The obvious answer would be to aim for AAA compliance, which is the strictest and most complete standard. The Level AAA standard will create an environment that maximizes your audience reach by providing support for the widest audience of disabled persons.
Practically speaking, **aim for Level AA conformance**. This is the highest realistic conformance level to achieve. The W3C recommends this level as well:
> It is not recommended that Level AAA conformance be required as a general policy for entire sites because it is not possible to satisfy all Level AAA Success Criteria for some content.
One example of Level AAA conformance that cannot be satisfied is [WCAG criterion 2.1.3 for keyboard shortcuts](https://www.w3.org/TR/WCAG21/#keyboard-no-exception). If your application is only available on mobile, such as the Uber app, there is no real keyboard, so it cannot be navigated by keyboard shortcuts, and will not pass this success metric. This, of course, is not a practical requirement because Uber is designed for use on mobile and should not have to try and cater their strategy towards a requirement like this.
![Keyboard events](/assets/images/keyboard-events.jpg)
Now that we have an idea of what level we need to aim for, here are the new success criteria in the updated guidelines and how to pass them.
## New WCAG success criteria
### Level A: The bare minimum
#### 2.1.4 Character Key Shortcuts (source)
##### What it means
Keyboard shortcuts aren't just for power users like vim developers. They are also super useful for the blind, who have no reason for using a monitor or mouse. Therefore, it is a good idea to implement some basic keyboard shortcuts to navigate your site without the need for a mouse.
##### How to pass
* **If you have shortcuts, make sure you can disable them**
* **The shortcut can be mapped to a different key than you provided by default**
* **The shortcut can only be used on a component on in focus.** You can't accidentally trigger the shortcut if it isn't actively on the part of the page that should receive that shortcut.
#### 2.5.1 Pointer Gestures (source)
##### What it means
Things like touch gestures have many ways of interacting. Single point gestures include tapping or clicking. Multipoint gestures include things like pinch-to-zoom. Path-based gestures include things like swiping or sliding. Your application should account for all of these.
![touch gestures need fallbacks as well](/assets/images/touch-interaction.jpg)
##### How to pass
If you use multipoint or path-based gestures, the **actions that are triggered by those gestures should have a single point fallback**. In other words, if you use the swipe event to mark something as complete, you should also have a touch or click gesture to mark that item complete as well.
#### 2.5.2 Pointer Cancellation (source)
##### What it means
Actions can be triggered with, for example, a click. Those actions should be reversible.
##### How to pass
* **Don't use the down-event** (e.g. `keydown` or `mousepress`) unless it is essential
* **Show a popup or alert to undo the previous action**
* **If you use a down-event, an equivalent up-event should be available to undo that action**
#### 2.5.3 Label in Name (source)
##### What it means
Images can be used inside of a label to describe content. The component housing those images should have a text fallback of that image content.
##### How to pass
* **If you use an image to label something, the naming attribute should have the text of that image or a description of that image as a fallback.** [Accessible name and description mappings](https://www.w3.org/TR/accname-1.1/#accessible-name-and-description-mapping) include attributes like `alt` for `` tags, [captions, transcriptions, and descriptions](https://webaim.org/techniques/captions/) for `2018-06-27T14:11:00+00:00How to find excellent refactoring opportunities
https://www.adamconrad.dev/blog/how-to-find-excellent-refactoring-opportunities/
Thu, 14 Jun 2018 21:13:00 +0000https://www.adamconrad.dev/blog/how-to-find-excellent-refactoring-opportunities/
# An example in refactoring
Here's the (obfuscated) code I was working with to begin our journey:
```javascript
const addEvents = (events, label, segment) => events.push([label, segment]);
const main = () => {
const [upcoming, past] = partition(
arr, date => today.diff(date, 'day') <= 0
);
const events = []
if (upcoming.length > 0) {
addEvents(events, 'upcoming stuff', upcoming);
}
if (past.length > 0) {
const byMonth = groupBy(past, date =>
moment(date).format('YYYY-MM');
);
const months = Object.keys(byMonth).sort().reverse();
months.forEach((month, idx) => addEvents(events, `${idx} months ago`, month);
}
return events;
}
```
This code is pretty simple - I've got two arrays of dates: upcoming dates, and past dates (including today). I just need to let the world know what's coming up and what is in the past, so it's not a simple push to an array because each chunk is split off by month (except for the future, which is just labeled future stuff), and each section is really a tuple of label and date.
So what immediately got me tweakin' on this code?
## Single Responsibility Principle
By now the SOLID acronym is engrained in my brain. If you don't know what it is, thoughtbot wrote a [great intro about what SOLID is](https://robots.thoughtbot.com/back-to-basics-solid) so definitely check that out.
The most obvious one that people seem to remember is the first one, Single Responsibility Principle. It's probably the easiest because the definition really says it all. Your class/function/whatever should do one thing and do it well. In this case, I see a bunch of stuff, so my first thought is to break this function up into multiple functions.
How do I know how to break it up? Stuff like `if` statements are literal blocks of code. Since everything is essentially scoped to blocks in JavaScript, anything inside of a bracket is a good candidate to break out into its own function.
I'm also going to cheat and skip a step, because `if` statements also provide another subtle hint: **the `if` keyword can usually be refactored into a guard clause.** Why? Because it is already guarding some code against being run within your function.
If we rewrite the `if` in guard clause notation (meaning, exit early before the function has a chance to run), along with the SRP refactoring, we get:
```javascript
const addEvents = (events, label, segment) => events.push([label, segment]);
const addUpcomingEvents = (events, upcoming) => {
if (upcoming.length === 0) return;
addEvents(events, 'upcoming stuff', upcoming);
};
const getPastMonths = past => {
return groupBy(past, date => moment(date).format('YYYY-MM'));
};
const addPastEvents = (events, past) => {
if (past.length === 0) return;
const months = Object.keys(getPastMonths(past)).sort().reverse();
months.forEach((month, idx) => addEvents(events, `${idx} months ago`, month);
};
const main = () => {
const [upcoming, past] = partition(
arr, date => today.diff(date, 'day') <= 0
);
const events = [];
addUpcomingEvents(events, upcoming);
addPastEvents(events, past);
return events;
}
```
## Mutation is ugly
Already this looks way better. `main()`'s single responsibility is populating the array of `events` and returning them. Each extracted function has one responsibility based on its function definition. So I guess we can call it a day, right? The next smell I see is this pattern:
```javascript
const myPoorFriendArray = [];
// a world of hurt, warping our pristine, empty friend
blackBoxFunction(myPoorFriendArray);
moreTorture(myPoorFriendArray);
theGauntlet(myPoorFriendArray);
// we chew him up and spit him out, never knowing the horrors he experienced
return myPoorFriendArray;
```
Ya feel me on this one? Our boy `myPoorFriendArray` is now an honorary X-Man: mutated from his original, empty form, into some chimera of various parts of the code. [Kanye said it better than I can](https://www.youtube.com/watch?v=fSJoDuU328k), but this is the stuff I _DON'T LIKE_.
Maybe you come from an OO world like Java and mutation is just a regular, everyday occurrence. If you're woke and you come from a functional background like Elm, you can see that mutants are bad. We don't want no X-Men fighting for us on the coding battlegrounds (sorry Gambit, love you).
So what's the fix here?
1. **Send the empty array in as an argument to our first add function**
2. **The function will take in the empty array and make a copy**
3. **The copy is filled out and returned**
The argument is not mutated, and we remove state dependencies. If you've done any functional programming, terms like _stateless_ and _immutability_ are everyday terms, and that's what we're striving for.
Our stateless code doesn't have [side effects](https://hackernoon.com/code-smell-side-effects-of-death-31c052327b8b), a nasty code smell that makes bugs more difficult to track down because state persists across functions. Without state, what you put into it is what you get out of it each and every time. Making your functions predictable, and thus less likely to include errors you didn't test for.
Here's what this change looks like:
```javascript
const addEvents = (events, label, segment) => events.push([label, segment]);
// we didn't touch the middle functions...
const main = () => {
const [upcoming, past] = partition(
arr, date => today.diff(date, 'day') <= 0
);
return addPastEvents(addUpcomingEvents([], upcoming), past);
}
```
Now we're getting somewhere! But for the savvy reader, something should still seem off. Can you see what's still wrong?
## Negate the state
**We didn't really remove the state from the add functions.** Yeah, we gave it a starting point in our `main()` function, but the `addEvents()` function is still adding events to a singular `events` array which is getting passed around like a rag doll. How do I know this?
`Array.prototype.push` is a mutable function. It provides an interface to add stuff to an array and returns the number of elements in the enlarged array. How do we do this in an immutable way?
`Array.prototype.concat` is the immutable equivalent of `push()`. It takes an array, combines it with another array, and those two are combined into a newly-allocated array. So we can modify the above slightly to be:
```javascript
const addEvents = (events, label, segment) => {
if (!label) return events;
return events.concat([[label, segment]]);
};
```
Now we never touch our inputs. `events` remains as pristine as it was the day it arrived in this humble function, and instead we use `concat()` to combine `events` with the new array we've made, which is the array of our tuple (a point of clarification: tuples don't exist in JavaScript, but I'm using that term to describe our 2-item array to make the distinction clearer here).
Now I'm in audit mode: where else do I need to follow this pattern of not touching the arguments and ensuring I return an `Array` type?
## Provide type safety
If you aren't using TypeScript, this is a great way to practice some proactive type safety to ensure that your function returns objects of the same type regardless of the output. That means don't return an `Array` if you have stuff, but return `undefined` or `null` if you exit early (like in a guard clause). Oh crap, I'm doing that. Looks like another opportunity to refactor!
```javascript
// unchanged but illustrating that all functions, including this one, always return an Array
const addEvents = (events, label, segment) => {
if (!label) return events;
return events.concat([[label, segment]]);
};
const addUpcomingEvents = (events, upcoming) => {
if (upcoming.length === 0) return events;
return addEvents(events, 'upcoming stuff', upcoming);
};
const getPastMonths = past => {
return Object.keys(groupBy(past, date => moment(date).format('YYYY-MM'))).sort().reverse();
};
const addPastEvents = (events, past) => {
if (past.length === 0) return events;
return getPastMonths(past).reduce((acc, month, idx) => {
return addEvents(acc, `${idx} months ago`, month), events);
};
};
```
Lots to unpack here. So in `addUpcomingEvents()` all we did was make sure we return an `Array` and not `undefined` (ending with just `return;` is shorthand for returning an `undefined` object). We do this because `concat()` returns an array, so we want to make sure all `return` statements provide the same type of object in any given function.
Next I did some refactoring of `getPastMonths()` to handle the sorting an reversing, because the `groupBy` function _technically_ returns an `Object`, and a way for it to return an `Array` is to grab the `Keys` (which is an `Array`) and do our necessary transformations to that array object.
Finally, `addPastEvents()` starts out the same as the upcoming function by ensuring our guard clause returns an `Array` type. The next part is a bit wilder. Originally we were taking an array and iterating over it using `Array.prototype.forEach`. The problem is that this iterator doesn't return an `Array` like we need it to. It simply gives us a platform to view every item in our array.
We also know that in the end, we want one array object that adds in all of the past events. **When you think about needing to combine things into one and return that combined object, think of using [`Array.prototype.reduce`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/Reduce).**
In this case, I knew I needed to add the events (by month) into our `events` array, _and_ return that newly combined array, using `events` as the starting point. The reduce function takes two arguments, a `callback` on how to combine stuff, and an optional `initial` object to begin with.
The `callback` is probably the most confusing part - what is this nested `return` trying to do? The `reduce()` `callback` argument has two arguments of its own: an `accumulator` object, which is initialized to our `initial` object (if you leave it out, it defaults to `undefined`), and the `current` item that is being iterated on. There are two additional optional arguments: the `index` of the item you are iterating on, and the original `array` object you called `reduce()` on. Since we need the `index` argument to label our months, I added that in.
So with all of that said, the `reduce()` function is basically saying:
> For each past month in my array, add it (with its label) onto my accumulated array, which is the cumulation of every previous iteration, starting with my initial `events` array.
## The final result
![Refactored code is good code](/assets/images/blur-close-up-code-546819.jpg)
It was at this point I called it a day and was satisfied with my refactoring for code review I mentioned in the beginning. The final product looks like this:
```javascript
const addEvents = (events, label, segment) => {
if (!label) return events;
return events.concat([[label, segment]]);
};
const addUpcomingEvents = (events, upcoming) => {
if (upcoming.length === 0) return events;
return addEvents(events, 'upcoming stuff', upcoming);
};
const getPastMonths = past => {
return Object.keys(groupBy(past, date => moment(date).format('YYYY-MM'))).sort().reverse();
};
const addPastEvents = (events, past) => {
if (past.length === 0) return events;
return getPastMonths(past).reduce((acc, month, idx) => {
return addEvents(acc, `${idx} months ago`, month), events);
};
};
const main = () => {
const [upcoming, past] = partition(arr, date => today.diff(date, 'day') <= 0);
return addPastEvents(addUpcomingEvents([], upcoming), past);
}
```
To summarize, the major things we covered in this refactoring are:
1. **Simplify functions down to a single responsibility**
2. **Eliminate side effects by using immutable functions and not modifying arguments**
3. **Ensure type safety by verifying all return statements provide the same type of object**
This gives us code that is easier to read, less prone to bugs, and easier to test.
Is there anything else you would refactor? Did I miss something? [Hit me up on Twitter](http://twitter.com/theadamconrad) and let me know what else could make this code better.
]]>2018-06-14T21:13:00+00:00If you're really worried about the GitHub acquisition, here's what to look out for
https://www.adamconrad.dev/blog/if-youre-really-worried-about-the-github-acquisition-heres-what-to-look-out-for/
Mon, 04 Jun 2018 21:13:00 +0000https://www.adamconrad.dev/blog/if-youre-really-worried-about-the-github-acquisition-heres-what-to-look-out-for/ You might have heard that at the end of last year we became a part of Microsoft, but our services are still provided under a separate Terms of Service. We’re excited about leveraging Microsoft technology and resources to offer more valuable features and services to you.
And I've been on LinkedIn for the last 2 years. I'm neither frustrated nor thrilled. LinkedIn is just as much of a recruiter Zombieland as it has always been. You don't have to trust Microsoft when they say they don't want to change GitHub, but they are not offering any indication that they want to mess with it.
# Myth: GitLab and Bitbucket are safer options
The other issue I see is that if there is an unwavering hatred for Microsoft, that we must all leave for better pastures. The two biggest players in the space that are still independent are GitLab and Bitbucket. Are they _really_ all that much better than GitHub?
## They're all for-profit businesses
The most remarkable thing to me in reading the public's reaction is that people make it seem like GitHub was an open source non-profit company. That because they were the premiere company to publish your open source code, it must mean they were pure in intentions.
They're very much a for-profit and primarily make money off of storing private repositories. GitLab and Atlassian are no different! In fact, now that GitHub is part of Microsoft, they're a public company just like Atlassian.
Even though the business models for each are somewhat different in how they make money off of the code storage, the fact is each of those companies is closed source, with specific legal terms to designate what is owned by the contributors, and what is owned by the storage company. And they're _all_ trying to make money (and there's nothing wrong with that).
## You can still deploy your code to mega-corps
Even if you go with GitLab or Bitbucket, where are you going to deploy your code? Because unless you plan on buying your own servers and scaling like we did in the 90s and 2000s, you're going to use one of a few big players:
1. DigitalOcean. A company whose debt is mostly [financed by megabanks](https://www.crunchbase.com/organization/digitalocean#section-locked-charts) like HSBC and Barclays.
2. Heroku. A company now owned by Salesforce, a publicly traded company worth nearly $100BB.
3. Amazon Web Services. A company with a bigger market cap than Microsoft.
4. Azure. Which is owned by you know who...
So even if you self-host your code for the sake of using git with your team, you're still handing over your code to another massive company when you go to turn the lights on for customers. And you may have already done that deed with Microsoft.
## Look at the terms
The GitHub terms of service only mention open source [in section D.6](https://help.github.com/articles/github-terms-of-service/#d-user-generated-content):
> This is widely accepted as the norm in the open-source community; it's commonly referred to by the shorthand "inbound=outbound". We're just making it explicit.
This isn't even in the ownership section, which clearly states:
> You retain ownership of and responsibility for Your Content...we need you to grant us — and other GitHub Users — certain legal permissions, listed in Sections D.4 — D.7. These license grants apply to Your Content. If you upload Content that already comes with a license granting GitHub the permissions we need to run our Service, no additional license is required. You understand that you will not receive any payment for any of the rights granted in Sections D.4 — D.7. The licenses you grant to us will end when you remove Your Content from our servers unless other Users have forked it.
You can continue to read the rest of those sections, but the real key nugget that you're probably worried about is the legal permissions that apply to you:
> We need the legal right to do things like host Your Content, publish it, and share it. You grant us and our legal successors the right to store, parse, and display Your Content, and make incidental copies as necessary to render the Website and provide the Service. This includes the right to do things like copy it to our database and make backups; show it to you and other users; parse it into a search index or otherwise analyze it on our servers; share it with other users; and perform it, in case Your Content is something like music or video.
>
> **This license does not grant GitHub the right to sell Your Content or otherwise distribute or use it outside of our provision of the Service.**
The rest of those sections simply state that if you post public code it can be viewed by anyone, if you contribute to a repo it's bound to the repo's declared license, and when you upload code it comes with "moral rights," meaning you have the right to seek attribution and credit for the code you have published.
So how does this compare to ownership and IP for GitLab and Atlassian? [For GitLab](https://about.gitlab.com/terms/#subscription):
> 4.3 Customer and its licensors shall (and Customer hereby represents and warrants that they do) have and retain all right, title and interest (including, without limitation, sole ownership of) all software, information, content and data provided by or on behalf of Customer or made available or otherwise distributed through use of the Licensed Materials (“Content”) and the intellectual property rights with respect to that Content.
And [for Atlassian](https://www.atlassian.com/legal/customer-agreement):
> 7.4 Your Data. “Your Data” means any data, content, code, video, images or other materials of any type that you upload, submit or otherwise transmit to or through Hosted Services. You will retain all right, title and interest in and to Your Data in the form provided to Atlassian. Subject to the terms of this Agreement, you hereby grant to Atlassian a non-exclusive, worldwide, royalty-free right to (a) collect, use, copy, store, transmit, modify and create derivative works of Your Data, in each case solely to the extent necessary to provide the applicable Hosted Service to you and (b) for Hosted Services that enable you to share Your Data or interact with other people, to distribute and publicly perform and display Your Data as you (or your Authorized Users) direct or enable through the Hosted Service. Atlassian may also access your account or instance in order to respond to your support requests.
**TL;DR: you own your code.** In fact, if there was one company in that three that I am worried about, it's Atlassian. **Atlassian can collect, copy, store, modify, and use your data.** This isn't spelled out with GitLab or GitHub. So don't think the alternatives are any better for you now that Microsoft owns GitHub.
# Assuming it's all good, what can go wrong?
So if you're convinced that you don't need to jump ship, the truth is things _could_ go wrong. Here are the things to look for when the deal ultimately goes through and the acquisition is 100% complete:
1. **Look out for a GitHub conversion announcement.** Just because they've announced an acquisition does not mean the services have completely rolled over to Microsoft. In fact, the current terms of service for GitHub are dated May 25th which is pre-announcement. If things change and either GitHub or Microsoft announces the rollover is complete, make sure to read the changes carefully. If they change their terms, they'll make you accept or reject the new terms.
2. **If they introduce new terms, see who owns them.** As I mentioned earlier in LinkedIn's conversion announcement, they stated that they still owned their own separate terms of service. This will likely be the case for GitHub as well. If that isn't the case, you'll have to read Microsoft's terms as well as GitHub's. Like class inheritance, whatever you read on GitHub is layered on top of what Microsoft is applying to their software licenses.
3. **If the terms change, specifically check Sections D.3-D.7 for change in ownership or rights.** That's the core section that deals with who owns and distributes the code. If this becomes more restrictive or unfavorable to your community, that would be a good time to re-evaluate your source code hosting strategy.
Is this acquisition a big deal? Yes. Is it a bad deal for you? Probably not. Keep calm and code on, but if the terms change, read them carefully and re-evaluate if they are no longer favorable to your interests.
]]>2018-06-04T21:13:00+00:00How to Improve on Naming Contexts in Domain-Driven Design
https://www.adamconrad.dev/blog/how-to-improve-on-naming-contexts-in-domain-driven-design/
Wed, 30 May 2018 23:25:00 +0000https://www.adamconrad.dev/blog/how-to-improve-on-naming-contexts-in-domain-driven-design/ How would I describe `Question` and `Answer` to a friend in one word?
First of all, we can look at the description I gave in the beginning. In addition, we remember that contexts are nouns, so when I'm parsing those first descriptive sentences, I see words like **forum** and **site**.
Another technique is to bring these models into reality. Where do you see Q&A in the real world? At tech talks, for example, it's usually at the end of a presentation where the speaker has a discussion with an audience. Using the noun technique in the previous sentence, potential context ideas are **presentation** and **discussion**. Now we can apply the domain-driven part of our design to ask ourselves:
> What domain does this belong to, and are there any conflicts or ambiguities we need to worry about?
For presentation, this belongs to the domain of things like public speaking and business. The problem is, presentations are also conflated with the MVP (Model-View-Presenter) pattern. It is also a specific format for PowerPoint. Given that this word has specific connotations with programming jargon, we can rule this one out. That leaves us with **discussion** and **forum** as our context for our Q&A models. Honestly, either will work here, but one hint is that forum relates more to this particular domain than discussion does, so I think we'll go with forum.
Now we can repeat the process for `User` and `Organization`. Since these aren't on the same level, we could argue that they shouldn't be in the same context, but we have to give them _a_ context because there really isn't a point in making a 1-to-1 mapping of models to contexts for an application this small. But we also see a natural mapping here: people belong to organizations all of the time, whether it's a company or a volunteering outfit.
How do we describe people in an organization? We say they're organized by some hierarchy and are usually grouped into teams or divisions. That gives me a few nouns: **hierarchy**, **grouping**, **team**, and **division**. I can already see that team and division are immediately out since that is an example of how organizations are split and don't fully encompass the user-organization relationship. Grouping is good, but the word *group* itself does creep a bit too close to computer terms such as `GROUP BY` or the C# `grouping` keyword. So I'll go with **hierarchy**.
![Quora domain modeling example with contexts](/assets/images/quora-ddd.png)
Nice work! We have a clear designation between two contexts that make sense. One thing to point out is that **contexts can be renamed or changed later**. It's not important that we anticipate the use of these contexts as we add models. Sure, `Hierarchy` might limit our flexibility, as it could be a stretch to add in helper models like `Address` into this context, but you can worry about refactoring later.
### Example 2: Twitter
Twitter is a microblogging service that allows people to express their thoughts in a very limited space. All messages (called "tweets") have a 280 character limit and are broadcast to followers. Followers can then interact with their connection's messages by liking or sharing (via "retweeting") content that they think is valuable to their own networks.
![Twitter domain modeling example](/assets/images/twitter-no-ddd.png)
What did you come up with?
Let's again apply the same heuristics from above to decide what our contexts will be. The first thing that stands out to me is the intimate tie between `User` and `Tweet`. It's the absolute core of this platform. Simplifying the noun technique, my first thought was that this _is_ the **microblog**. However, given that Twitter has increased its word count from 140 to 280 characters, I feel like the *micro* portion of this is basically irrelevant, but the **blog** portion aptly contextualizes this relationship. Plus, it's a natural context for future potential models like `Comment`.
Next, there are the user-related interactions: following/followers is largely abstracted away since it's a `User` to `User` relationship. `Like` and `Share` are new models, and do have similar mechanisms that feel like they could be grouped. What did I just call these things? **Interaction**, that's one (in fact, I also used it in the opening paragraph). I also used **network** which would describe the following/follower relationship, but again, networking is computer jargon and that could collide with a lot of other concepts we are working with. I think **interaction** is a good one:
![Twitter domain modeling example with contexts](/assets/images/twitter-ddd.png)
This one was a bit easier given our previous look, but again, I really want to stress that **there is no right answer**. All we are attempting to do here is make it easy for you to figure out the naming, but at the end of the day, there's no actual penalty in naming or grouping these however you like. In Phoenix, there is certainly some cost with talking between contexts, but it's not an insurmountable issue.
### Example 3: Google Drive
Google Drive is a cloud file storage provider. You can access files from anywhere in the world on any device, provided you have a Google account. In addition to reading files, you can collaboratively write to files with other users who have permission to read (and write) these files. Further, you can collaborate via tools like comments and sticky notes to provide feedback even if you cannot directly write to the file. Google Drive does not provide universal file capabilities and is mainly focused on reading/writing for documents, spreadsheets, and slideshow presentations.
![Google Drive domain modeling example](/assets/images/google-drive-no-ddd.png)
This one is a bit more complex. We have quite a few more models here with a bit more ambiguity for how we can split things up. Let's start with the easiest one that all exist on the same level: `Document`, `Spreadsheet`, and `Presentation`. How did I group those in the above description? I called them all **files**. Normally we'd red-flag this context because it's too computer-specific of a domain, but remember, this service is _very_ computer focused by its very nature, so it's actually okay that we give this a context like this.
The next layer up we have our `IO` which handles permissions for file read and write access controls and `Comment` which is a special kind of write. Finally, we have the `User` at the very top, which touches everything. Now your gut might tell you that reading, writing, and commenting are **interactions** just like in the Twitter example. But we can't do that for a few reasons:
1. The methods for determining read and write access are encapsulated within `IO` - they are permission grants to unlock certain functionality, but `IO` in of itself is not an interaction
2. Remember, `User` has access to all of these models. Even though this UML diagram lays everything out in a hierarchy, it's not a simple three-tiered relationship, so in a way, we _interact_ with our files as well by virtue of owning them.
Let's offer a different narrative. We mentioned earlier that we can collaborate with other users to perform actions. The noun here is **collaboration**. Additionally, those actions are governed by **permissions**. Which do we choose?
This is probably the toughest one of all. Collaboration implies an innate connection with others, even though you can manipulate a file without any other users. Whereas permission makes sense with both user interaction and `IO` but seems to leave `Comment` out in the cold. The key clue we have is that **`Comment` is a subclass of `IO`, so we can lower the priority of making `Comment` work for our context**. In other words, since we know permission is a sensible context for both `User` and `IO`, it stands to reason that **permission** will be a good context name because `Comment` is a type of `IO` in this domain.
![Google Drive domain modeling example with contexts](/assets/images/google-drive-ddd.png)
This provides lots of ability to expand with both file types and ways of handling things centered around the `User`, such as authorization and authentication. Both contexts also pretty cleanly separate the levels of hierarchy between data models.
## Wrapping up
Let's summarize our heuristic into three simple steps:
1. **Map the domain.** We used UML diagrams to lay out the hierarchy of our models to provide visual clues for possible contexts.
2. **Horizontal or vertical alignment provides clues.** If something falls along the same plane or area of your diagram, you can bet it's a context.
3. **Describe that alignment in one word, preferring the noun with the more specific domain.** This is the hardest part, but the idea here is to take the description of our application and pick out the nouns (and sometimes verbs) that associate our related models. From there, we simply prune the word that has the clearest message with the least ambiguity.
Naming contexts is hard. It can be made easier if you follow these guidelines the next time you write an application using domain-driven design.
Make sure you don't stop with just your first iteration. Growing applications require consistent refinement. Sometimes you'll need to split off a context into multiple contexts, while other times you'll need to consolidate contexts. Don't be afraid to go in either direction. Finally, don't be afraid to be wrong, because there is no definitive right answer either. Good luck and happy naming!
]]>2018-05-30T23:25:00+00:00How to Load Images to Cut Page Time
https://www.adamconrad.dev/blog/how-to-load-images-to-cut-page-time/
Thu, 24 May 2018 18:25:00 +0000https://www.adamconrad.dev/blog/how-to-load-images-to-cut-page-time/ 0) {
imageCount--;
observer.unobserve(entry.target);
preloadImage(entry.target);
}
}
}
var applyImage = function(img, src) {
img.classList.add('js-lazy-image--handled');
img.src = src;
}
if (!('IntersectionObserver' in window)) {
loadImagesImmediately(images);
} else {
observer = new IntersectionObserver(onIntersection, config);
for (image in images) {
var image = images[i];
if (image.classList.contains('js-lazy-image--handled')) {
continue;
}
observer.observe(image);
}
}
## How it works
This script works by looking for images that have the class **js-lazy-image**. Those images must have a data attribute called data-src which has the location of the image, just like the normal src attribute would be filled out. The script then counts the number of images with this class, and when the viewport intersects with an observable area that includes the image, the source of the image is loaded from it’s data attributes, and the image renders on the screen! Here’s an example of how you would call lazy loaded images:
As you can see, this works almost exactly like a regular image, but you just need to make sure to add on the special JavaScript class and add a **data-** prefix in front of your src attribute. And that’s all it takes to go from this:
![](https://cdn-images-1.medium.com/max/3998/1*ZIW_GhTLsCzE23pXO_7ulg.png)
To this:
![](https://cdn-images-1.medium.com/max/3998/1*jVANxQdyxpl8UQ3gmmLr8Q.png)
## How the script works — section by section
Still curious as to how the script works? Let’s break it down by section:
var images = document.querySelectorAll('.js-lazy-image');
var config = {
rootMargin: '50px 0px',
threshold: 0.01
};
var imageCount = images.length;
var observer;
Here we initialize our variables. Remember that class we added to our images? This is where we collect our images. The configuration will be used later with our observer class (don’t worry if you don’t know what that is, we’ll get to that in a second). Finally, we store the image count because we’ll be using that a few times throughout our script.
var fetchImage = function(url) {
return new Promise(function(resolve, reject) {
var image = new Image();
image.src = url;
image.onload = resolve;
image.onerror = reject;
});
}
Our first function deals with grabbing our images. We make use of [JavaScript Promises](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Using_promises) to asynchronously load our images. If the Promise was resolved successfully, we load the image with the URL for our image source, otherwise we display an error. So where is this called from? Glad you asked…
var preloadImage = function(image) {
var src = image.dataset.src;
if (!src) {
return;
}
return fetchImage(src).then(function(){
applyImage(image, src);
});
}
This image preload function grabs the image source from the data attribute we tacked onto our image tag. If it doesn’t find a source object, no problem, we just stop right there. Otherwise, we fetch our image, and if things go great, we apply the image to the DOM like so:
var applyImage = function(img, src) {
img.classList.add('js-lazy-image--handled');
img.src = src;
}
Our function simply finds the image we’re about to reveal, adds on a class to let our script know it has been handled, and then adds the URL to the image’s src attribute.
if (!('IntersectionObserver' in window)) {
loadImagesImmediately(images);
} else {
observer = new IntersectionObserver(onIntersection, config);
for (image in images) {
var image = images[i];
if (image.classList.contains('js-lazy-image--handled')) {
continue;
}
observer.observe(image);
}
}
This is the meat-and-potatoes of our script. The core functionality that enables our lazy loading to happen quickly and efficiently revolves around the [Intersection Observer API](https://developer.mozilla.org/en-US/docs/Web/API/Intersection_Observer_API). This API allows us to track changes to the viewport against target elements asynchronously.
Traditionally you would have to do something like this with the [Scroll Event](https://developer.mozilla.org/en-US/docs/Web/Events/scroll) which is both slow and called constantly. To combat this performance hiccup, you might using [debouncing or throttling](https://css-tricks.com/debouncing-throttling-explained-examples/) to limit the number of scroll event requests. But with Intersection Observers, this is all handled for you.
Remember the config variable at the top of our script? This is the configuration for our Intersection Observer. It tells us that when we are within a margin of 50px (vertically) from our image, that is when we want to activate our observation callback. The threshold, as you might guess, is the tolerance for what percentage of the object must be observed in order for the callback to be invoked once the margin is reached. In our case, we chose 1%, which is immediately upon bringing the image tag into view.
So now that we have that background, we can see how this if statement works. If we see that Intersection Observers are a part of the window object, we know we are in a browser that supports this functionality. As of right now, [Intersection Observers are available on all major browser except IE and Safari](https://developer.mozilla.org/en-US/docs/Web/API/Intersection_Observer_API#Browser_compatibility). So if you are on IE or Safari, you will load the images immediately. If you aren’t, we create a new Intersection Observer object with the configuration we provided in the beginning, as well as a callback function to trigger when our target observation is reached.
Finally, we have to tell the Observer exactly what it has to observe for the callback to be initialized. In this case, we are observing all of the images that haven’t already been handled, which are the images that haven’t been applied to the DOM yet (via the applyImages function we saw earlier).
So what does loading images and the observation callback look like?
var loadImagesImmediately = function(images) {
for (image in images) {
preloadImage(image);
}
}
For loading images immediately, it’s pretty straightforward. We simply preload all of the images and put them on the screen like we normally would.
var onIntersection = function(entries) {
if (imageCount === 0) {
observer.disconnect();
}
for (entry in entries) {
if (entry.intersectionRatio > 0) {
imageCount--;
observer.unobserve(entry.target);
preloadImage(entry.target);
}
}
}
Our intersection callback is a bit more involved. If we have loaded all of our images that have our lazy loading CSS class, we’re done and we can disconnect from our Observer object. Otherwise, for every observer entry in our IntersectionObserver, we want to activate our images. We do that by ensuring we have reached our threshold. **intersectionRatio** is the property we need to see if our target image element is visible within the threshold we defined in our configuration. If it is, the property returns a 1, otherwise it returns a 0. So if we have landed within our necessary ratio, we have one more image to add to the DOM, which means we can remove 1 from our count of images remaining to load. We can therefore stop observing this image because it’s going to be loaded onto the page. Finally, we use our previously-defined preloadImage to execute our now familiar process of adding the image URL onto the image tag and loading the image into the DOM.
## Next steps
**Lazy loading images is a quick and painless way to drastically improve the performance of your pages that use a lot of imagery.** From there, be sure to [compress your images](https://adamconrad.dev/the-fastest-way-to-increase-your-sites-performance-now/) and use the correct image format so you keep a small footprint. Tools like [MachMetrics](https://www.machmetrics.com/) are a great way to track your performance improvements over time, as well as provide additional suggestions on how to continuously improve the performance of your application.
What other quick wins do you have for speeding up your site? Leave a reply in the comments below!
*Originally published at [www.machmetrics.com](https://www.machmetrics.com/speed-blog/how-to-lazy-loading-images-script-slash-page-load-time/) on May 24, 2018.*
]]>2018-05-24T18:25:00+00:00Here is the Full List of My 50+ Remote Job Sites
https://www.adamconrad.dev/blog/here-is-the-full-list-of-my-50-remote-job-sites/
Tue, 15 May 2018 23:25:00 +0000https://www.adamconrad.dev/blog/here-is-the-full-list-of-my-50-remote-job-sites/2018-05-15T23:25:00+00:00The Fastest Way to Increase Your Site's Performance Now
https://www.adamconrad.dev/blog/the-fastest-way-to-increase-your-sites-performance-now/
Wed, 25 Apr 2018 04:25:00 +0000https://www.adamconrad.dev/blog/the-fastest-way-to-increase-your-sites-performance-now/
At this point, you've already exhausted even the most advanced compression optimization techniques. And while these programs do all of the work for you in a matter of seconds, the truth is you may need to scrap your images altogether if they are in the wrong image format.
How do you know if they're in the right format? Follow this simple questionnaire and then save your image into the new file format (and then repeat levels 1 and 2 to compress your new images!).
#### Does it look like a photograph? Use a JPG
JPGs were meant to be used with photographs. If you have avatars of your team, photographs of your office, or other real-life images, make sure they are in the JPG format.
#### Does it look like a computer-generated graphic or drawing? Use a PNG
Everything else can be a PNG. Other formats may provide better quality, but if you're serving a web or image application, a PNG will do and is universally read by every kind of application. The one exception to this would be icons...
#### Does it look like an icon? Use an SVG
Icons are also computer-generated, but the difference is that icons are generally used alongside typography. If you think the image will be used in a menu, on a button, or is meant to symbolize something, it's probably an icon, and icons will benefit from being SVGs because they can be scaled along with your type and lose 0 fidelity. They will look just as crisp as your fonts and will be much smaller as SVGs as well.
#### Are you supporting the latest browsers and don't care about Firefox? Use WebP, JPEG 2000, and JPEG XR
Finally, there is a push for next-generation image formats. JPG and PNG have been around for more than two decades, and it's about time we have some new formats that innovate to maintain image quality without bloating our applications. Even better, they don't require you decide between image type. For example, WebP works great for both photographs and computer-generated images.
The downside is that support is fragmented across devices and browsers. [WebP](https://developers.google.com/speed/webp/) was made by Google, so it's naturally designed only for Chrome and Chrome mobile. JPG also has evolved formats, but [JPEG2000](https://jpeg.org/jpeg2000/index.html) is only supported by Apple (Safari and Safari Mobile), while [JPEG XR](https://jpeg.org/jpegxr/index.html) is only supported by Microsoft (IE and IE Edge).
What about Firefox? There is no next-gen format for this browser, but they do have a [2-year-old bug ticket](https://bugzilla.mozilla.org/show_bug.cgi?id=1294490) to implement WebP and it is assigned, but who knows when this will land.
### Level 4: Source sets & fallbacks
If you've chosen the correct image and you've compressed the heck out of it, you'll want to make sure it's being served on the right device in the right aspect ratio. I alluded to this back in Level 2, but if you have concerns about Retina displays like MacBook Pros, or even x3 quality for the newest iPhones, you'll want to ake multiple copies of your image in all of these formats.
Going back to the previous example, if you serve avatars in a 64x64 JPG format, you'll also want to make copies of dimension 128x128 for Retina _and_ 192x192 for Retina x3. The best way to do this is to **start with a larger image and scale down, rather than scale up**. You know those crime dramas where they ask the hacker to "_Zoom! Enhance!_"? We all know that doesn't work in real life, and that same thing holds true for your images - you can't add clarity where there was none in the first place.
Instead, start with the original source image (say, 512x512) and scale down to 192, save a copy, then 128, save another copy, then 64, and save that last copy. This will result in a less blurry, albeit still lossy (because you are removing information in the form of pixel fidelity) set of images.
With all of these duplicate, scaled images, how do we tie this all together? The image attribute known as `srcset` comes to the rescue:
```
```
`Srcset` is pretty amazing. It will always default to the original `1x` magnifier like a normal image tag would. However, if it does find the other images, it will apply them to the given aspect ratios for your device. In other words, if you are viewing the photo on a 2015 MacBook Pro, the browser will select `avatar-md.jpg`, but if you are on an iPhone X, it will select `avatar-lg.jpg`. And if you're in a military bunker using IE8, it will fall back to `avatar-sm.jpg`.
`Sizes` is another property for responsive images but it relies on the width of the device rather than the pixel density. The format is the same:
```
```
You specify an image, space, the size the image should be displayed, with a `w` descriptor for the `srcset`, and then, using the `sizes` attribute, specify the media queries at which point the various sources should be used.
The only downside? **`Srcset` is not supported in IE**. It _is_ supported in IE Edge, and the `sizes` attribute is supported everywhere. In my honest opinion, this isn't something to worry about because all of the Retina devices using IE are already new enough to support IE Edge. Anything that still requires IE11 and down likely doesn't have a Retina display anyway (unless it is connected to an hi-density external monitor) so you likely won't run into this problem being a real blocker for you.
## Something is better than nothing
This is not an exhaustive list of image optimization techniques nor is it meant to be a prescriptive formula. **Even if you only run ImageOptim on all of your images in your application, you could be saving upwards of 60-80% of what your users have to download.**
Images, audio, and video, _not_ your source code, will comprise the largest space for your application by far. Effectively choosing, compressing, and displaying your images will have a marked impact on both application size and performance for your end user experience. The best part is it only takes a few downloads and a few more seconds to run your images through a very easy set of tools that will provide an instant improvement without sacrificing quality.
]]>2018-04-25T04:25:00+00:002020-02-27T23:00:19+00:00