I want to start with a simple analogy to the (ordinary) derivative. So suppose that ω is a k-form, and X1,…,Xk are vector fields. And for the moment, I want you to imagine that the Xi fields are all "constant" near some point p. Now that doesn't really make sense (unless you're in Rn), but bear with me. If p is the north pole, and X1(p) is a vector that points towards, say, London, then it makes sense to define X1 near p to also point towards London, and those vectors will (in 3-space) all be pretty close to X1(p).
Then we can define a function
f(q)=ω(q)[X1(q),…,Xk(q)]
for q near p.
How does f(q) vary as q moves away from p? Well, it depends on the direction that q moves. So we can ask: What is
f(p+tv)−f(p)?
Or better still, what is
(f(p+tv)−f(p))/t?
especially as t gets close to zero?
That "derivative" is almost the definition of
dω(p)[X1(p),…,Xk(p),v].
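To make that difference quotient concrete, here's a small numerical sketch. The specific form ω = x dy on R2, the constant field X1 = (0,1), the point p, and the direction v are all made up for illustration; note that the limit is only "almost" dω, since the full exterior derivative also involves antisymmetrization terms.

```python
import numpy as np

# Hypothetical 1-form on R^2: omega = x dy.
# Paired with the constant vector field X1 = (0, 1),
# the contraction is f(q) = omega(q)[X1] = q_x * 1.
def f(q):
    x, y = q
    X1 = np.array([0.0, 1.0])  # "constant" vector field near p
    return x * X1[1]

p = np.array([2.0, 3.0])       # base point (arbitrary)
v = np.array([1.0, 0.0])       # direction in which q moves away from p

# The difference quotient (f(p+tv) - f(p))/t as t shrinks:
for t in [1e-1, 1e-2, 1e-3]:
    quotient = (f(p + t * v) - f(p)) / t
    print(t, quotient)
```

Since f is linear here, the quotient is exactly the directional derivative (1.0) for every t; in general it only converges as t → 0.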
There are a couple of problems with that "definition" as it stands:
1. What if there are multiple ways to extend Xi(p), i.e., what if "constant" doesn't really make sense? Will the answer be the same regardless of the values of Xi near p (as opposed to at p)?
2. How do we know that dω has all those nice properties like being antisymmetric, etc.?
3. How does this fit in with div, grad, curl, and all that?
Problems 1 and 2 are why we have fancy definitions of d that make theorems easy to prove, but hide the insight. Let me just briefly attack item 3.
For a 0-form, g, the informal definition I gave above is exactly the definition of the gradient. You have to do some stuff with mixed partials (I think) to verify that the derivative, as a function of the vector v, is actually linear in v, and therefore can be written dg(p)[v]=w⋅v for some vector w, which we call the "gradient of g at p."
So that case is pretty nice.
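Here's a numerical check of that gradient case, with a made-up function g on R3: the difference quotient dg(p)[v] agrees with ∇g(p)⋅v, and you can see the linearity in v directly.

```python
import numpy as np

# Hypothetical 0-form (ordinary function) on R^3.
def g(q):
    x, y, z = q
    return x**2 * y + np.sin(z)

def grad_g(q):
    # Gradient computed by hand, for comparison.
    x, y, z = q
    return np.array([2 * x * y, x**2, np.cos(z)])

p = np.array([1.0, 2.0, 0.5])  # arbitrary base point
t = 1e-6

# dg(p)[v] via the difference quotient, vs. grad_g(p) . v:
for v in [np.array([1.0, 0.0, 0.0]),
          np.array([0.0, 1.0, 0.0]),
          np.array([0.3, -1.0, 2.0])]:
    dq = (g(p + t * v) - g(p)) / t
    print(dq, grad_g(p) @ v)
```

The two columns agree to within the O(t) error of the one-sided difference quotient.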
What about the curl? That one's messier, and it involves the identification of every alternating 2-form with a 1-form (because 2+1=3), so I'm going to skip it.
What about div? For the most basic kind of 2-form, something like
ω(p)=h(x,y,z)dx∧dy
and the point p=(0,0,0) and the vector v=(0,0,1), and the two "vector fields" X1(x,y,z)=(1,0,0) and X2(x,y,z)=(0,1,0), we end up looking at
f(p+tv)=h(0,0,t)dx∧dy[(1,0,0),(0,1,0)]=h(0,0,t)
and the difference quotient ends up being just
∂h/∂z(0,0,0).
That number tells you how ω's "response" to area in the xy-plane changes as you move in the z direction.
What does that have to do with the divergence of a vector field? Well, that vector field is really a 2-form-field, and duality has been applied again. But in coordinates, it looks like (0,0,h), and its divergence is exactly the z-derivative of h. So the two notions match up again in this case.
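The div example above can be checked numerically too. The particular h below is made up; the point p, direction v, and vector fields X1, X2 are the ones from the example, and the difference quotient lands on ∂h/∂z at the origin.

```python
import numpy as np

# Hypothetical coefficient for the 2-form omega = h(x,y,z) dx^dy.
def h(x, y, z):
    return x * y + z**3 + 2.0 * z  # dh/dz at the origin is 2.0

def f(q):
    # X1 = (1,0,0), X2 = (0,1,0), and dx^dy[X1, X2] = 1,
    # so the contraction f(q) = omega(q)[X1, X2] is just h(q).
    return h(*q)

p = np.zeros(3)                 # the origin, as in the example
v = np.array([0.0, 0.0, 1.0])   # move in the z direction
t = 1e-6

quotient = (f(p + t * v) - f(p)) / t
print(quotient)  # approximates dh/dz(0,0,0)
```

The quotient comes out as t² + 2, so it converges to ∂h/∂z(0,0,0) = 2 as t → 0, matching the hand computation above.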
I apologize for not drawing out every detail; I think that the main insight comes from recognizing the idea that the exterior derivative is really just a directional derivative with respect to its last argument...and then doing the algebra to see that it's also a directional derivative with respect to the OTHER arguments as well, which is pretty cool and leads to cool things like Stokes' theorem.