
25 September 2019

Stress on a Cracked Foundation

Think about a house standing strong.  Imagine a crack in the foundation.  Maybe the ground under the house has settled unevenly.  Maybe things have shifted since the house was built.

If the settling has gone on long enough that the foundation is cracked, there is a failure waiting to happen.  Putting enough stress on a cracked foundation will lead to a dramatic failure.

This is like living at the edge of health: physical, social, emotional, or spiritual.  If you aren't constantly investing in repairing and strengthening your foundation, you are more susceptible to unexpected failure.

I want to eat healthy food, keep in touch with good friends, keep enough space in my life, and stay strong in my faith, so that when the storms of life come, I can get through the challenging times.  When I ask God for help, I can be more confident I will be able to receive His help to bridge me back to a stable and happy future.

03 January 2012

Wish TwitterFeed supported auto-hashtags

After using TwitterFeed for a while, I can say it's really cool!  Props to betaworks for an awesome tool!

All I want now is auto-hashtag support.

Just look for the last line of a post that is only hashtags, and #dont #take #arbitrary hashtags from the #body of the #post.

Don't bother with trying to convert blog post tags into hashtags -- I want a different vocabulary of hashtags on twitter vs. tags on my blog.
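
To spell out the rule, here is a quick Python sketch of the detection I have in mind -- my own illustration, not TwitterFeed's actual code:

    import re

    # Matches a line made up only of hashtags and whitespace.
    HASHTAG_LINE = re.compile(r"^\s*(?:#\w+\s*)+$")

    def trailing_hashtags(post_body):
        # Only take hashtags when the last non-empty line is nothing but
        # hashtags; never pull arbitrary hashtags out of the body.
        lines = [line for line in post_body.splitlines() if line.strip()]
        if lines and HASHTAG_LINE.match(lines[-1]):
            return re.findall(r"#\w+", lines[-1])
        return []

    print(trailing_hashtags("Neat tool! #not #these\n\n#twfeed #hashtags\n"))
    # -> ['#twfeed', '#hashtags']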

#twfeed #hashtags

06 April 2011

Core Goals of Version Control

After a couple of years of experience with Git, I was finally able to clearly articulate the core value that version control brings to software development. This article is written with a bias toward Git, but I've at least tried to express the core value in abstract terms.

The value is expressible in 4 goals.

Core Goals
  1. Capture source artifacts
  2. Isolate stable codelines from WIP
  3. Enable concurrent WIP
  4. Make it easy to leave sensible history behind
WIP = Work In Progress

The continuous delivery crowd says that WIP should be basically zero. I think that there is space for a fairly short development pipeline (3-5d) that can free developers up to capture ideas and then wrangle them into shape in a second or third pass.

Capture Source Artifacts

As a developer, I feel happy when I can push a save button when I've taken a small step -- and then never have to think about it again. I also feel happy when I can save, even if I'm based on a slightly out-of-date base. I also feel happy when I don't have to worry about binary vs. text -- as long as it's a true source file of a reasonable size.

I feel happiest when I don't have to be distracted by superfluous concerns, and can focus on the task at hand and save the results easily.

Main Point: A good version control system makes it easy to capture source artifacts.

Isolate Stable Codelines From WIP

Software is hard to get right, and once things get stable, the only way to keep them there is to avoid making risky changes. But stability is always balanced against the common desire for enhancement and restructuring.

A typical bug fix is one or two isolated commits that fix a specific problem. Most version control systems more or less support cherry-picking a small change over into another codeline.

I feel happy when I'm able to easily take an isolated series of commits from the development codeline into a stable one, even if the fix turns out to be a little larger than a couple commits.

Main Point: A good version control system makes it easy to cherry-pick changes, and even longer patch sequences, from one codeline to another with easy reconciliation of overlapping edits.

Enable Concurrent WIP

For software developed by a team larger than 2-3 people, it is useful to be able to work in parallel with each other, even on the same feature. Sometimes you can arrange your work so as not to overlap at all, but there are real situations where you want to track someone else's development within the fairly short development pipeline (3-5d) before things have gotten stable.

I feel happy when I don't have to catch people up on what I've been doing, and when they don't have to catch me up on what they've been doing, and we can get to the point easily and quickly. I also feel happy when we can leap-frog each other with ideas and real implementation.

Main Point: A good version control system makes it easy to track other team members' work, and easy to tentatively integrate with that work.

Make It Easy To Leave Sensible History Behind

After stepping away from a piece of software, it is easy to lose context and forget the concerns that shaped the development of that software. It doesn't take long; for me, about 2 weeks is enough to start forgetting details and motivations.

Of course, it is important to leave a properly structured artifact behind. Proper naming means a lot; proper factoring is important. Once you comprehend the structure, it is possible to change things without introducing big problems.

But often, full comprehension isn't practical, and a flat 2D view of the artifact is insufficient for reasoning about the thing itself. It is often very useful to be able to get a vector on where the software came from, and interpolate from that vector where it was headed.

The creative process is messy, filled with double steps and backtracking. Don't confuse me with all the noise. Give me a history that, given all the knowledge you had when it got stable, at least looks well-reasoned.

I feel happy when I'm able to take an easy editing pass after capture/review, one that lets me intentionally craft history as an email to the future about the vector of change inherent in a certain patch sequence. I also feel very happy not to have to think about this during the creative, messy, focus-draining capture/review pass.

Main Point: A good version control system includes tools to both: 1) enable the developers to easily leave behind meaningful history, and 2) extract focused history for all or part of a piece of software.

12 July 2010

How to rewrite a complex test

An Integrated Test is suboptimal for asserting basic functionality and basic sub-component collaboration behavior.

So if you have a massive Integrated Test, how is it possible to rewrite that test into some number of the following kinds of tests?
  • focused unit test
  • focused collaboration test (how one class collaborates with another)
  • systems-level integration test (load balancer behavior, queuing system behavior)
I think it comes down to the following activities:
  • enumerate the different permutations of state
  • enumerate the different permutations of flow
  • for each permutation of state: create one focused unit test
  • for each permutation of flow: decide whether 1) the permutation of flow devolves into a sub-component collaboration test, or 2) into a systems-level integration test
  • create the required focused collaboration tests
  • create any required systems-level integration tests (usually very rare)
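To make the first few activities concrete, here is a minimal sketch with a hypothetical TaxCalculator whose RateLookup collaborator is replaced by a mock -- all names invented for illustration:

    import unittest
    from unittest.mock import Mock

    # Hypothetical production class, invented for illustration. Its real
    # RateLookup collaborator is replaced by a Mock in the tests below.
    class TaxCalculator:
        def __init__(self, lookup):
            self.lookup = lookup

        def tax(self, amount, state):
            return amount * self.lookup.rate_for(state)

    # One focused unit test per permutation of state.
    class TaxCalculatorTest(unittest.TestCase):
        def test_zero_rate_state(self):
            lookup = Mock(rate_for=Mock(return_value=0.0))
            self.assertEqual(TaxCalculator(lookup).tax(100, "OR"), 0.0)

        def test_taxed_state(self):
            lookup = Mock(rate_for=Mock(return_value=0.08))
            self.assertAlmostEqual(TaxCalculator(lookup).tax(100, "CA"), 8.0)

    # One focused collaboration test per permutation of flow.
    class TaxCalculatorCollaborationTest(unittest.TestCase):
        def test_asks_lookup_for_the_right_state(self):
            lookup = Mock(rate_for=Mock(return_value=0.05))
            TaxCalculator(lookup).tax(100, "WA")
            lookup.rate_for.assert_called_once_with("WA")

    if __name__ == "__main__":
        unittest.main()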
There is an interesting smell that comes from the activity of creating tests. There may be an existing test that is responsible for asserting the focused behavior, but it isn't in the right place, so it is hard to find out whether it exists. In this case, the act of "create focused test" implies the act of "move focused test into its rightful home" (so others, including yourself, can find it later).

In a meeting in Feb. 2010, I wrote the following about problems I've experienced with a test suite at work:
  • Inability to run a test context-free => high re-run costs and downstream delays
  • Too much custom test infrastructure => high maintenance costs
  • Risk of centralized integration => waiting on central integration before shipping
The approach I suggested was to find the top 20% costliest tests, and focus on those.

I suggested measuring "costliest tests" using a combination of the following criteria:
  • How many superfluous assertions in this test?
  • How many superfluous historical failures has this test generated in the last 6 months?
  • How long does it take to run this test?
  • How many "permutations of state" is this test trying to cover?
  • How many "permutations of flow" is this test trying to cover?
  • How far away from the code is this test?
  • Is there a place closer to the code where those "permutations of state and flow" can be adequately tested?
  • Are there ways to ensure all the "permutations of flow" can be covered without having to mix the test with trying to test all the "permutations of state" at the same time?
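If you wanted to operationalize that ranking, a back-of-the-envelope scoring script might look like this sketch -- the weights and sample numbers are entirely invented, and the point is only the relative ordering:

    # Rank tests by a rough cost score; weights are invented for illustration.
    def cost_score(t):
        return (2.0 * t["superfluous_assertions"]
                + 3.0 * t["superfluous_failures_6mo"]
                + t["runtime_seconds"] / 60.0
                + 2.0 * t["state_permutations"]
                + 2.0 * t["flow_permutations"])

    tests = [  # hypothetical measurements gathered from the suite
        {"name": "checkout_end_to_end", "superfluous_assertions": 12,
         "superfluous_failures_6mo": 9, "runtime_seconds": 840,
         "state_permutations": 30, "flow_permutations": 4},
        {"name": "cart_totals_unit", "superfluous_assertions": 0,
         "superfluous_failures_6mo": 0, "runtime_seconds": 2,
         "state_permutations": 1, "flow_permutations": 1},
    ]

    costliest = sorted(tests, key=cost_score, reverse=True)
    top_20_percent = costliest[: max(1, len(costliest) // 5)]
    print([t["name"] for t in top_20_percent])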
The whole idea is to simulate expensive parts of our tests in a way that still gives us the confidence that the test is valid and covers the desired integration case.

Where and how to test what

J.B. Rainsberger wrote in the fall of 2009 about why he thinks typical programmer use of "Integrated Tests" leads to a vicious cycle of denial and suboptimal behavior.

His overall ideas were summarized well by Gabino Roche, Jr.

And there are good uses of integration tests.

I was about to hit the delete button on this post, because I thought all I had to say had already been said. But there was still something to say: How do I personally work in a way that avoids the Vortex of Doom?

The key idea that has helped me personally is to pause, and ask the following question:
What is the responsibility of this test?
and then to consider the answer to the related question:
What is the responsibility of the class being tested?

Of course, those are fairly basic OO questions. However, when you're writing tests along with the code, there is a situation that is easy to get stuck in: having so many things in mind at once that you get confused about the purpose of the test, and even the software you are working to create.

There are at least three things that tend to compete for mind space:
  1. What observable behavior do you want out of your software?
  2. How do you think you might implement that?
  3. How does what you are building interact with the rest of your system?
And, when #2 gets the top spot in my mind, I find myself forgetting about #3, and resorting to copy/paste for #1 (from other similar tests). However, when I focus on #1 and, by extension, #3, I find myself getting new ideas about how to actually implement the new behavior.

In addition, I find that these new ideas are reorienting in nature. The new stuff I'm working on ends up either modifying an existing concept in a novel way, or introducing a new concept that collaborates with the existing system in a certain way. Then the test I thought I was going to write ends up being a straightforward unit test on a new class or on new methods of an existing class, plus a couple of collaboration tests that make sure the new behavior actually gets used.

In the end, there are a few questions that need to get answered:
  1. Does the new behavior work? (unit tests will tell you this, 80% of tests)
  2. Is the new behavior hooked up? (collaboration tests will tell you this, 15% of tests)
  3. Does the whole thing hold together? (automated deploy to a production-style test site with system-level smoke tests will tell you this, 5% of tests)
And the system-level smoke tests are only responsible for making sure that certain itemized flows work, not all permutations of state that go through those flows.
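
For what it's worth, a system-level smoke test in this scheme can be very small. Here is a sketch; the site URL and flows are hypothetical:

    import urllib.request

    BASE_URL = "https://smoke.test.example.com"  # hypothetical test site

    def flow_responds(path):
        # A smoke test only checks that an itemized flow is alive and
        # hooked up, not every permutation of state behind it.
        with urllib.request.urlopen(BASE_URL + path, timeout=10) as resp:
            return resp.status == 200

    for path in ("/login", "/search?q=smoke", "/checkout"):
        assert flow_responds(path), "smoke flow failed: " + path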

Hopefully this is a useful addition to the already-posted conversations started in 2009.

29 April 2010

Reality Quotient

There is a fairly subjective measure I've only recently been able to give a name to:
Reality Quotient = ability to keep context while working toward a specific goal

This has to do with how deep you allow your stack to be. If you let the stack get too deep, it hurts your net throughput on Cockburn's "unvalidated decisions" or DeMarco's "Total Useful Mental Discriminations (TUMD)". If the stack is too deep, you end up wasting a lot of time making useless decisions about things that have ceased to have anything to do with the REAL task at hand.

The tendency to accept decisions that pin you into a corner is closely related to a lowered Reality Quotient. There is a whole book about the attitude of Getting Real, and I equate that attitude with a high Reality Quotient.

I did a search to see whether anyone else had published a writeup under the "Reality Quotient" heading. Although I found a lot of stuff on the web, none of it really matched what I wanted to say, so to be clear, this post is not about any of that.

The measurement I wanted to talk about is how capable you are of focusing on the problem you set out to solve until 1) it is truly solved and published to the world, or until 2) you have redefined the problem into another solvable one and published that transformation to the world.

In short, a high Reality Quotient requires a short stack, with tight feedback loops, focused on publishing real stuff to real people.

04 September 2009

An entire sprint without an SVN branch

A typical development flow has been:
  • request an SVN task branch (wait for CM to help)
  • do stuff (1-3 weeks)
  • run all the regression tests (not on the latest stable)
  • hope that nobody did anything that conflicted with us
  • merge to the pending release (and pray we resolve any conflicts correctly)
Now we do the following:
  • set up a team git repo (able to do this without CM's help)
  • clone off a repo for each team member
  • set up a daily cron job to get the latest stable from the pending release
  • set up a daily hudson build to merge that in and push to team if it passes unit tests
  • pull from team as we go (fast!)
  • run all the regression tests (on the latest stable)
  • use git to merge on top of the pending release and commit back to SVN (with minimal conflict windows)
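Collapsed into a single script, the daily merge-and-publish job looks roughly like the following sketch; the remote names, branch names, and test command are all invented here, and the real thing was wired into cron and hudson:

    #!/usr/bin/env python
    # Sketch of the daily job: pull the latest stable, merge, test, publish.
    import subprocess
    import sys

    def run(*cmd):
        print("+ " + " ".join(cmd))
        return subprocess.call(cmd)

    # 1. Get the latest stable code (mirrored from the pending release in SVN).
    if run("git", "fetch", "stable-mirror") != 0:
        sys.exit(1)

    # 2. Merge it into the team integration branch.
    if run("git", "merge", "stable-mirror/pending-release") != 0:
        sys.exit(1)  # leave conflicts for a human; never publish a broken merge

    # 3. Publish to the team repo only if the unit tests pass.
    if run("python", "-m", "unittest", "discover") == 0:
        run("git", "push", "team", "HEAD:integration")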
Well, we finally got through a whole sprint without an SVN task branch. Maybe that seems like a small victory, but it spared us a bunch of late-breaking integration risk.

28 April 2009

Nice benefits of git

I use git now and it saved my bacon last week. Two things made a really hard, late-breaking change possible:
  1. Ease of integration enabled parallel development.
  2. Smallness of commits meant easy bug isolation.
Ease of Integration => Parallel Development
Because of the ease of integrating small commits, my coding companion and I were able to develop two complementary sets of changes in parallel. This saved us quite a bit of time because we didn't have to wait for each other. The changes were developed as a last-minute response to a problem right before release, and without the ease of integration, it would have been hard to get the fix right on a compressed schedule.

Small Commits => Easy Bug Isolation
Also, because commits are so small (it's easy to commit without side effects, since your repo is isolated from everything else), it was easy to isolate the exact changes that caused a subtle problem. I could have used bisect, but I had an inkling which change caused it, so that was faster. With svn, or with a mongo patch accrued over multiple changes, it would have been much more confusing and disorienting.
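
If I hadn't had that inkling, bisect can even be automated: you hand git bisect run a small check script that exits zero for a good revision and non-zero for a bad one. A sketch, with a hypothetical failing test:

    # check.py -- exit 0 if this revision is good, non-zero if it's bad.
    # Usage sketch:  git bisect start <bad> <good>
    #                git bisect run python check.py
    import subprocess
    import sys

    # Hypothetical: the subtle problem shows up as a failing unit test.
    sys.exit(subprocess.call(["python", "-m", "unittest", "tests.test_pricing"]))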

19 March 2009

Tracking work to be done

My team had the task of porting code off of an old API onto a new (hopefully-mostly-equivalent) API.

Diomidis Spinellis wrote an article in IEEE Software titled "Start with the Most Difficult Part" (posted on his blog, with the official reference in the magazine). Now seemed like a good time to apply the advice.

So I wanted to measure what the most difficult part was: how many usages of the old API were present, and where I should focus my efforts first.

So I wrote macker rules to detect usages to port, and a ruby script that drove the ant macker target and pulled the usage counts into a list. Then I published the current counts and started a graph so we could track the usages down to zero.
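The real script was ruby driving macker through ant. As a rough stand-in to give the flavor, a Python script that just counts imports of the old API per package might look like this -- the package name is invented:

    import pathlib
    import re
    from collections import Counter

    # Hypothetical old-API package to be ported away from.
    OLD_API_IMPORT = re.compile(r"^import\s+com\.example\.oldapi\.")

    counts = Counter()
    for src in pathlib.Path("src").rglob("*.java"):
        for line in src.read_text(errors="ignore").splitlines():
            if OLD_API_IMPORT.match(line.strip()):
                counts[str(src.parent)] += 1

    # Publish the current counts, biggest offenders first.
    for pkg, n in counts.most_common():
        print("%5d  %s" % (n, pkg))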

Here is the chart we ended up with:

[Chart: remaining old-API usage counts over time, trending down to zero]

This drove us to two useful behaviors:
  1. Converting our tests first, starting with the most complex one (this exposed a lot of issues early).
  2. Giving our work a metrics-driven feel -- all we had to focus on was getting the count to zero.
There were also two things I missed:
  1. The hardest task to port was one in which the new API didn't have anything near what we needed, and the usage measurement didn't catch that. ==> A quick pass over the tasks looking for especially risky ones (where the port would be really hard) would have exposed that earlier.
  2. Another team was doing other work that intersected with ours, and we were unaware of the impact of their efforts on what we were doing. ==> Being able to get their changes quickly and easily may have helped expose the problem earlier.
Now that we've started using git, I'm going to suggest that we do two things:
  • Get an SVN task branch so we can integrate as a team during the sprint
  • Get an SVN task branch right before merging to release and re-apply all our git patches onto that task branch so the merge is really smooth
Or maybe if we can get done in time, we could volunteer to do that for the team that has the sprint-long task (getting them on everyone else's changes with a minimum of svnmerge.py headache).