C(s1) < C(s2) + C(s1 | s2) + O(1).
Suppose we can find the shortest program p2 that generates s2 on our machine. Suppose also that we can find the shortest program p21 that generates s2 from s1. The length of p2 is C(s_2), and the length of p21 is C(s1 | s2). The concatenation of p2 and p21 is a program that generates s1, though not necessarily the shortest one (hence the inequality in the chain rule). This only holds for "unambiguous" program codes (so that the machine knows where p2 ends and where p21 begins).