Procedure behind Separating Axis Theorem

I recently decided to convert the collision detection system in a work-in-progress game of mine to an SAT approach. Unfortunately, it hasn’t “clicked” yet. I understand the theory, but a few parts of the procedure are still fuzzy to me. Basically, I want to prototype a simple box to box collision detector in which you can drag the boxes around. If they are colliding, it should return the penetration vector with the shortest penetration. This is my incomplete procedure:

  • Determine which objects to test collision (easy in said prototype as there are only two boxes)
  • Determine which vectors to project (no idea about this one)
  • Determine the axes on which to project the vectors (the normalized normals of the halfwidth vectors, I think, which means there are only two for box to box)
  • Calculate the distance between both boxes as a vector (vx = (box2.x-box1.x), vy = (box2.y-box1.y))
  • Project the vectors for each box onto the first axis (simple, read a very good explanation of this somewhere)
  • If the sum of the objects’ radii on that axis is bigger than the distance between the two boxes (in other words, if there is an overlap), then continue; if not, they are not colliding (I think that you somehow use the projected vectors of the boxes to find their radii on that axis, but I’m not sure how)
  • Do the same for the second axis; if there is an overlap on that axis as well, then the two boxes are colliding. Return the penetration vector that is the smallest (the penetration vector is the smallest overlap, but my incompetent brain is telling me that is a scalar number)
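To make the steps above concrete, here is a rough Python sketch of how they might fit together for two unrotated boxes. It rests on assumptions that are not in my actual game: each box is stored as a center plus half-width and half-height (the `Box` class and the `penetration_vector` function are made-up names), the two axes are simply x and y, and a box's "radius" on an axis is its half-extent along that axis. Under those assumptions the smallest overlap becomes a vector by multiplying it by the axis it was measured on, which may be the answer to the scalar question in the last step.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Box:
    """Hypothetical unrotated box: center (x, y) plus half-extents."""
    x: float
    y: float
    hw: float  # half-width, i.e. the "radius" on the x axis
    hh: float  # half-height, i.e. the "radius" on the y axis


def penetration_vector(box1: Box, box2: Box) -> Optional[Tuple[float, float]]:
    """Return the displacement that moves box1 out of box2, or None if apart."""
    # Distance between the boxes as a vector (center to center).
    vx = box2.x - box1.x
    vy = box2.y - box1.y

    # First axis (x): sum of radii minus the projected distance.
    overlap_x = (box1.hw + box2.hw) - abs(vx)
    if overlap_x <= 0:
        return None  # a separating axis exists, so no collision

    # Second axis (y): same test.
    overlap_y = (box1.hh + box2.hh) - abs(vy)
    if overlap_y <= 0:
        return None

    # The overlap is a scalar; scaling the axis by it gives the vector.
    # The sign is chosen so the result pushes box1 away from box2.
    if overlap_x < overlap_y:
        return (-overlap_x if vx > 0 else overlap_x, 0.0)
    return (0.0, -overlap_y if vy > 0 else overlap_y)
```

With `Box(0, 0, 1, 1)` and `Box(1.5, 0.5, 1, 1)` the overlap is 0.5 on x and 1.5 on y, so the sketch returns a vector along x. Boxes that only touch are treated as not colliding here, which is a choice rather than a requirement.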

As you can see, there are many holes. After I succeed in prototyping box to box, I plan to move on to box to triangle, which is basically the same as box to box except that there are three axes. Those are the only two collisions that will need to be detected in my game. I do not plan to use rotated objects, if that helps. Can somebody fill in the holes in my procedure?
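For the box to triangle case, this is a sketch of the more general form as I currently understand it, so treat it as a guess rather than a reference. It assumes each shape is a list of vertices in order, that the vectors to project are those vertices, and that the candidate axes are the normalized edge normals of both shapes. That gives the two box axes plus one per triangle edge, with duplicates tested twice for simplicity; a right triangle aligned with the box would leave three unique axes. All names here (`edge_normals`, `project`, `sat_mtv`) are made up for illustration.

```python
import math
from typing import List, Optional, Tuple

Vec = Tuple[float, float]


def edge_normals(poly: List[Vec]) -> List[Vec]:
    """Unit normals of every edge of a polygon given as ordered vertices."""
    normals = []
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        ex, ey = x2 - x1, y2 - y1
        length = math.hypot(ex, ey)
        normals.append((-ey / length, ex / length))
    return normals


def project(poly: List[Vec], axis: Vec) -> Tuple[float, float]:
    """Project every vertex onto the axis; return the (min, max) interval."""
    dots = [x * axis[0] + y * axis[1] for x, y in poly]
    return min(dots), max(dots)


def centroid(poly: List[Vec]) -> Vec:
    """Average of the vertices, used only to pick the push direction."""
    n = len(poly)
    return (sum(p[0] for p in poly) / n, sum(p[1] for p in poly) / n)


def sat_mtv(a: List[Vec], b: List[Vec]) -> Optional[Vec]:
    """Vector that moves shape a out of shape b, or None if they are apart.

    Simplification: when one interval fully contains the other, the interval
    overlap used here is not the exact minimum push distance.
    """
    best_overlap = math.inf
    best_axis = (0.0, 0.0)
    for axis in edge_normals(a) + edge_normals(b):
        min_a, max_a = project(a, axis)
        min_b, max_b = project(b, axis)
        overlap = min(max_a, max_b) - max(min_a, min_b)
        if overlap <= 0:
            return None  # found a separating axis
        if overlap < best_overlap:
            best_overlap, best_axis = overlap, axis

    # Flip the axis if it points from a toward b, so the result pushes a away.
    ca, cb = centroid(a), centroid(b)
    toward_b = (cb[0] - ca[0]) * best_axis[0] + (cb[1] - ca[1]) * best_axis[1]
    if toward_b > 0:
        best_axis = (-best_axis[0], -best_axis[1])
    return (best_axis[0] * best_overlap, best_axis[1] * best_overlap)
```

A box would be passed as its four corners and a triangle as its three, for example `sat_mtv([(0, 0), (2, 0), (2, 2), (0, 2)], [(1.5, 0.5), (3.5, 0.5), (1.5, 2.5)])`. Since nothing rotates in my game, the normals could be computed once per shape instead of on every test.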

And I have read Metanet’s tutorials on SAT, so please don’t direct me over there. They were good at explaining the theory, but a little lacking in the actual implementation for me.