Science Forums

Recommended Posts

Posted

Ya, well, I promised myself I would never again get didactic on a forum. Ho, hum, the urge overpowered me!

 

So, in another thread - which? don't remember - Qfwfq raised a point which I have been doodling over the last coupla days that I want to say a few words about.

 

First, though, let me say that, as my penchant for mathematical formalism seems not to be too popular here, I will try to keep it to a minimum. But some arithmetic must be shown, I believe.

 

Start here.

 

Suppose that [math]V[/math] is an arbitrary vector space over the real field. Then there always exists a "sister" space [math]V^{\ast}[/math] of linear maps [math]\varphi: V \to R[/math], with [math]\varphi(v) = \alpha \in R[/math] for all [math]v \in V, \, \varphi \in V^{\ast}[/math].

 

This sister space is called the dual space to our original vector space.

 

Now suppose that [math]V[/math] is an inner product space. This simply means that here we know what we mean by the length of vectors and the angle between any two such. The inner product is often written (lazily) as [math] (\,, \,): V \times V \to R,\quad (v,w) = \alpha \in R[/math]. (Ask me why I say "lazily", if you dare!)

 

But in this case, we can always find an element in the dual space such that [math]\varphi(v) = (v,w) = \alpha \in R[/math]. Since the spaces [math]V,\, V^{\ast}[/math] are isomorphic, there is exactly ONE such element in [math]V^{\ast}[/math] for each [math]w[/math], and accordingly one writes [math]\varphi_w(v) = (v,w) = \alpha \in R[/math].
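To make this concrete, here is a quick Python sketch (entirely my own toy example, with made-up numbers): on [math]R^3[/math] with the usual dot product, fixing a vector [math]w[/math] pins down one dual-space element [math]\varphi_w = (\cdot, w)[/math], and that functional is linear.

```python
def inner(v, w):
    """Standard inner product (dot product) on R^n."""
    return sum(a * b for a, b in zip(v, w))

def phi_w(v, w=(1.0, 2.0, 3.0)):
    """The dual-space element represented by the fixed vector w."""
    return inner(v, w)

# phi_w is linear: phi_w(a*u + b*v) == a*phi_w(u) + b*phi_w(v)
u, v = (1.0, 0.0, 2.0), (0.0, 1.0, -1.0)
a, b = 2.0, -3.0
lhs = phi_w(tuple(a * x + b * y for x, y in zip(u, v)))
rhs = a * phi_w(u) + b * phi_w(v)
assert abs(lhs - rhs) < 1e-12
```

Nothing deep here, of course: it just shows the "one functional per [math]w[/math]" bookkeeping in action.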

 

I apologize to those who have studied linear algebra for my superficiality, but, for those that haven't, I am open to questions.

 

More - much more - later, if there is a taste for this.

Posted

Yes, you were invited to do so.

 

Suppose that [math]A,\,B[/math] are sets. Then we know that [math](a,b)[/math] is an element in the set [math]A \times B[/math] - it is not a mapping of any sort.

 

Likewise, in the case that [math]V[/math] is a vector space, then one may assume that [math](v,w)[/math] is an element in the vector space [math]V \times V[/math].

 

The correct formalism, as I am sure you are aware, is this:

 

Since [math]V^{\ast}: V \to R[/math], then [math]V^{\ast}\otimes V^{\ast}: V \times V \to R[/math]

 

Which, to my simple way of thinking, requires an element, call it [math]g_{ij} \in V^{\ast}\otimes V^{\ast}[/math], with the property that [math]g_{ij}(v^i,v^j) = \alpha \in R[/math], where summation over like up/down indices is assumed.

 

The gadget [math]g_{ij}[/math] is called, as you know very well, the metric tensor and is, by definition, of type (0,2).

 

This is a very far cry from the "lazy" assumption that [math](\,,\,) \equiv g_{ij}[/math]
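For the concretely-minded, here is a little Python sketch (my own toy example) of the type-(0,2) gadget doing its job: it eats the components of two vectors and returns a number, summing over both indices.

```python
def g_contract(g, v, w):
    """Compute g_ij v^i w^j, summing over both repeated indices."""
    return sum(g[i][j] * v[i] * w[j]
               for i in range(len(v)) for j in range(len(w)))

# With the Euclidean metric on R^2 (g_ij = delta_ij) this is just the
# dot product, so the squared length of (3, 4) comes out as 25.
euclid = [[1.0, 0.0], [0.0, 1.0]]
assert g_contract(euclid, [3.0, 4.0], [3.0, 4.0]) == 25.0

# A different (still symmetric, positive-definite) matrix gives a
# different notion of length/angle on the same components.
other = [[2.0, 0.0], [0.0, 0.5]]
assert g_contract(other, [3.0, 4.0], [3.0, 4.0]) == 26.0
```

The point of the sketch is exactly the "far cry" above: the pairing depends on which [math]g_{ij}[/math] you feed it, which the lazy notation [math](\,,\,)[/math] hides.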

 

Anyway, let's not get too technical for now.......

Posted
The gadget [math]g_{ij}[/math] is called, as you know very well, the metric tensor and is, by definition, of type (0,2).

 

Sure it has to be like that? I mean in this case yes, but doesn't the definition of the metric tensor depend on which space it lives in? I mean [math]g^{ij}(v^*_i,v^*_j)=\alpha \in R[/math] would be a metric tensor of type (2,0) living in [math]V^{**}\otimes V^{**}=V\otimes V[/math]. And hence going from [math] V^{**}\otimes V^{**}: V^*\times V^*\to R[/math]

But ok, maybe I am getting confused after 8 hours at work ;-)

Posted
doesn't the definition of the metric tensor depend on which space it lives in?
Yes, certainly, you are right.
I mean [math]g^{ij}(v^*_i,v^*_j)=\alpha \in R[/math] would be a metric tensor of type (2,0) living in [math]V^{**}\otimes V^{**}=V\otimes V[/math]. And hence going from [math] V^{**}\otimes V^{**}: V^*\times V^*\to R[/math]
Well, yes, you are correct, but, though you are rightly identifying a vector space with its double dual, this requires an argument, which is rather subtle.

 

Say what; I had wanted to avoid too much in the way of technicalities. I am willing to be diverted (I just LOVE technicalities!), but that had not been my original intent

Posted
Sorry, yeah I remember there was some condition for the double dual to be the original space itself.
Actually, Ben's first post seems to be showing us that, if V has a good, "proper" scalar product then it is self-dual. From my foggiest memories though, I think I can seem to vaguely recall that any vector space is equal (or isomorphic, more strictly) to its double-dual.
Posted

Well yes, but I would suggest the term "naturally isomorphic" - which means this isomorphism is independent of a choice of basis. So one may equate, up to a natural isomorphism, a finite-dimensional space with its double dual. (Actually, the proof is fiddly, but not that hard, I think)
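For the curious, the natural map into the double dual can even be written down without mentioning a basis anywhere; here is a tiny Python rendering of it (all names are mine):

```python
def to_double_dual(v):
    """Send v to the element of V** that evaluates functionals at v.

    Note that no basis appears: the map is v |-> (phi |-> phi(v)),
    which is what makes the isomorphism 'natural'.
    """
    return lambda phi: phi(v)

v = (1.0, 2.0)
phi = lambda w: 3.0 * w[0] + 4.0 * w[1]   # an element of V* (my example)
assert to_double_dual(v)(phi) == phi(v)   # both are 11.0
```

Contrast this with an isomorphism [math]V \to V^{\ast}[/math], which needs either a basis or an inner product to write down.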

 

So anyway, so far I have been talking about generic vector spaces.

 

Let's specialize a very great deal, and consider tangent spaces. In order to do this, I need to convince you of a coupla things which might seem like a bit of a digression, more apparent than real, I believe.

 

A manifold [math]M[/math] is very roughly defined as a set of sets - a topological space - that is locally indistinguishable from some [math]R^n[/math] with the standard Euclidean metric. This definition means that, for any point [math]m \in M[/math], we will have an open set [math]U \subset M[/math] with [math]m \in U[/math], and a mapping [math] h: U \to R^n[/math] with the property that the image of this point can be described as an n-tuple [math]h(m) = (r^1, r^2,.....,r^n) \in R^n[/math].

 

Then, for any n, the [math]i[/math]-th coordinate function will be the projection [math] R^n \to R,\,\, r^i(h(m)) \in R[/math]. One then defines [math]x^i = r^i \circ h: U \to R^n \to R[/math], which is simply the real value of the i-th element in our n-tuple, which is itself the image of our point under the mapping [math]h: U \to R^n[/math]. Thus [math]x^i(m) = r^i[/math].

 

This might seem like a very long-winded way of saying something rather elementary, but the important point here is that the set [math]\{x^k\}[/math] is a set of real valued functions called coordinate functions.
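Here is that chart machinery in miniature, as a Python sketch (the circle example and all the names are my own): [math]h[/math] sends a manifold point to an n-tuple, [math]r^i[/math] projects out the i-th slot, and the coordinate function [math]x^i[/math] is the composition.

```python
import math

def h(m):
    """A toy chart U -> R^2: the 'manifold point' m is an angle on the
    unit circle, and h reads off its Cartesian coordinates."""
    return (math.cos(m), math.sin(m))

def r(i):
    """The i-th projection R^n -> R."""
    return lambda tup: tup[i]

def x(i):
    """Coordinate function x^i = r^i composed with h, mapping U -> R."""
    return lambda m: r(i)(h(m))

# At m = 0 the chart gives h(0) = (1, 0), so x^0(0) = 1 and x^1(0) = 0.
assert x(0)(0.0) == 1.0 and x(1)(0.0) == 0.0
```

The only point being made, as in the text: each [math]x^i[/math] is itself a real-valued function on the manifold, not a number.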

 

I now suppose that at each point [math]m \in M[/math] I have a vector space [math]T_mM[/math], whose elements can all be written [imath] v = \sum_i \alpha^i \frac{\partial}{\partial x^i}[/imath].

 

So now I seek its dual. Let's call it, somewhat unimaginatively, [math]T_m^{\ast}M[/math] and insist that these have the same properties as any other vector space and its dual.

 

Hmm, I was going to say more tonight, but Mrs. Ben is calling me to bed...who could resist?

 

PS I see on re-reading I have broken my promise of not being technical. Ho, hum, I never could keep a promise.

Posted

So we have the tangent space [math]T_mM[/math] defined, and we now want a basis for our co-tangent space [math]T_m^{\ast}M[/math].

 

Well, we don't have far to look. First notice that for any function [math]f:M \to R[/math] (we require it to be smooth, but let's not worry too much about that) we have that, since [math]f[/math] is a 0-form, [math]df[/math] is a 1-form. It will come in handy later if I write this out.

 

[math]df= \sum \nolimits_i \frac{\partial f}{\partial x^i}dx^i[/math], where the [math]x^i[/math] are our coordinate functions. Since I was at pains to point out that the [math]x^i[/math] are indeed functions, I may replace the [math]f[/math] on the RHS by [math]x^j[/math] and find this becomes [math]\sum \nolimits_i \frac{\partial x^j}{\partial x^i}dx^i[/math], where the derivative in the summand is obviously 1 when i = j, and 0 otherwise (our coordinates are independent by construction, after all).
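A quick numerical sanity check of that "1 when i = j, 0 otherwise" claim (my own, using crude finite differences on [math]R^3[/math] with Cartesian coordinates):

```python
def partial(f, point, i, eps=1e-6):
    """Numerical partial derivative of f: R^n -> R in direction i."""
    bumped = list(point)
    bumped[i] += eps
    return (f(tuple(bumped)) - f(point)) / eps

coord = lambda j: (lambda p: p[j])   # the coordinate function x^j

# dx^j/dx^i should come out as the Kronecker delta at any point.
p = (0.3, -1.2, 2.5)
for i in range(3):
    for j in range(3):
        d = partial(coord(j), p, i)
        expected = 1.0 if i == j else 0.0
        assert abs(d - expected) < 1e-6
```

Of course the analytic statement needs no numerics; this is just the delta made visible.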

 

So an easy choice of basis 1-forms for the co-tangent space [math]T_m^{\ast}M[/math] will be the set [math]\{dx^i\}[/math], and [math]\sum \nolimits_i\alpha_i dx^i[/math] will be an arbitrary vector in this space.

 

Let's call this vector [math]\varphi \in T_m^{\ast}M[/math].

 

Then I will require that, for [math]v = \sum \nolimits_j \beta ^j \frac{\partial}{\partial x^j}[/math] then [math]\varphi(v) \in R[/math].

 

Now, in spite of my huffing and puffing, I am stuck. Since, in the general case, for any bases [math]\{e_i\}[/math] for [math]V[/math] and [math]\{\epsilon^j\}[/math] for [math]V^{\ast}[/math], we have [math](e_i, \epsilon^j) = \delta^i_j[/math], I want to argue that

 

[math](\frac{\partial}{\partial x^i}, dx^j) = \delta^i_j[/math] in order to proceed (if I must).

 

But I'm stuck!!

 

*blush*

 

Any thoughts?

Posted

Ok, this is a bit hardcore, I had to look it up in my old GR-course notes.

Consider [math] f[/math] to be a differentiable function on an open set [math] U\subset M[/math] and consider [math] p \in U[/math] to be any point (what you called m before, but I prefer to call it p, because m sounds like a generic point of M and not one in U... but u would be better maybe...). Then for any [math] v\in T_pM[/math] one defines [math] (df)_p(v):=v(f)[/math], so it is a map [math] T_pM\to R[/math] and hence [math] (df)_p \in T_p^*M[/math].

 

Now, in a local coordinate system, and choosing [math] v=\frac{\partial}{\partial x^i}[/math], one has

[math] (df)_p(v)=\frac{\partial f}{\partial x^i} (p)[/math].

So now if one takes [math]f [/math] to be the coordinate function [math] x^j[/math] you get

[math] (dx^j)_p(\frac{\partial}{\partial x^i})=\delta^j_i[/math]

and hence [math] \{dx^i\}[/math] is the dual basis of [math] \{\frac{\partial}{\partial x^i}\}[/math].

 

And hence, to go a bit further, you have that

[math] (df)_p[/math]

can be written in its basis as (no surprise and with Einstein summation convention)

[math] (df)_p=\frac{\partial f}{\partial x^i}(p)dx^i[/math]

Which, just to be a bit complete, can be seen by

[math] (df)_p\left(\frac{\partial}{\partial x^j}\right)= \frac{\partial f}{\partial x^i}(p)\underbrace{dx^i\left(\frac{\partial}{\partial x^j}\right)}_{=\delta^i_j}=\frac{\partial f}{\partial x^j}(p)[/math]
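If anyone wants to see this with actual numbers, here is a rough Python check (my own toy function and point, nothing canonical) that the component expression [math](df)_p = \frac{\partial f}{\partial x^i}(p)dx^i[/math] paired with a vector really agrees with the directional derivative [math]v(f)[/math]:

```python
def directional(f, p, v, eps=1e-6):
    """v(f): numerical derivative of f at p along the vector v."""
    shifted = tuple(pi + eps * vi for pi, vi in zip(p, v))
    return (f(shifted) - f(p)) / eps

# Toy example: f(x, y) = x*y^2 on R^2, at p = (2, 3), with v = (1.5, -0.5).
f = lambda q: q[0] * q[1] ** 2
p, v = (2.0, 3.0), (1.5, -0.5)

# Components of (df)_p in the basis {dx, dy}: (y^2, 2*x*y) evaluated at p.
df_p = (p[1] ** 2, 2.0 * p[0] * p[1])

# The pairing collapses via dx^i(d/dx^j) = delta^i_j to a single sum.
pairing = sum(a * b for a, b in zip(df_p, v))
assert abs(pairing - directional(f, p, v)) < 1e-3
```

Both sides come out to 7.5 here, as the underbrace computation above predicts.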

 

 

Sad thing is that about 2.5 years ago I still understood all this with no problem and found it not too hard... now, I am no longer sure I understand it all...

Posted

Without being too finicky, the first thing that comes to mind is: as you say, what goes for [imath]f[/imath] goes for [imath]x^i[/imath] too and, of course, the [imath]d[/imath] of [imath]x^i[/imath] is [imath]d(x^i)=dx^i[/imath] so, therefore, it boils down to just [imath]dx^i=\frac{\partial x^i}{\partial x^j}dx^j=\delta^i_jdx^j[/imath] with summation understood, which implies what you want to show.

 

Sad thing is that about 2.5 years ago I still understood all this with no problem and found it not too hard... now, I am no longer sure I understand it all...
Yeah, just imagine 15 or more years later...:(
Posted
so, therefore, it boils down to just [imath]dx^i=\frac{\partial x^i}{\partial x^j}dx^j=\delta^i_jx^j[/imath]
Is this a typo, or what? You want [math]dx^i = x^j[/math] when i = j? This cannot be right. Assuming you meant [math]dx^i=\frac{\partial x^i}{\partial x^j}dx^j=\delta^i_j dx^j[/math], you merely have that [math] dx^i = dx^i[/math].

 

Sanctus: Your post was very much appreciated, but I was a little troubled by your fiat that [math]df(v) = v(f)[/math]. At first I thought you were suggesting that [math]df[/math] is the pull-back of [math]f[/math], and in a vague sort of sense I suppose it could be thought of as such. I tried for a derivation of this identity.

 

But first a really big caveat: We are free to treat the set of real numbers as a field, which I denote by [math]\mathbb{R}[/math], or as a vector space ([math]R^1[/math]) or as a manifold ([math]R[/math]) in my notation.

 

Suppose that [math]f:M \to R[/math]. Let us now consider [math]R[/math] as a manifold (it is trivially so). Then, for each tangent space [math]T_cR \equiv R^1[/math] there is a single coordinate function, call it [math]u[/math], the identity function.

 

So we know that, under this mapping, a vector [math]v \in T_pM[/math] is carried to some [math]df(v) = \alpha \frac{d}{du} \in T_cR^1[/math], for [math]\alpha \in \mathbb{R}[/math]; applying this to the coordinate function [math]u[/math], one will have that [math]\alpha = df(v)(u) = v(u \circ f) = v(f)[/math] (since [math]u[/math] is just the identity map on [math]R^1[/math]).

 

Thus, identifying [math]T_cR^1[/math] with [math]\mathbb{R}[/math] (i.e. dropping the basis vector [math]\frac{d}{du}[/math]), we get [math]df(v) = v(f)[/math].

 

This justifies your fiat.

 

By replacing [math]f[/math] by [math]x^i[/math] we arrive at [math] dx^i(\frac{\partial}{\partial x^j}) = \frac{\partial }{\partial x^j}(x^i) = \frac{\partial x^i}{\partial x^j}=\delta^i_j \in \mathbb{R}[/math] (no summation here).

 

Thus if the set [math]\{\frac{\partial}{\partial x^i}\}[/math] is a basis for [math]T_pM[/math], then the set [math]\{dx^j\}[/math] is a basis for its dual, [math]T_p^{\ast}M[/math]

 

Job done. Time for tea

Posted

Typo corrected! :doh:

Assuming you meant [math]dx^i=\frac{\partial x^i}{\partial x^j}dx^j=\delta^i_j dx^j[/math] you merely have that [math] dx^i = dx^i[/math].
Which is kinda what I meant but I can't remember what I meant to show by it... I've just been having connection problems and I'm just about to toss the pc out the window. :hihi:
Posted

OK, I now want to move to the main point of this thread, but to simplify my life, I want to introduce some notational short-cuts (none of them revolutionary, BTW)

 

I first set [math]\frac{\partial}{\partial x^i} \equiv \partial_i[/math].

 

Second I use the so-called summation convention: if identical indices appear in the up/down position, I assume summation over them. Thus, for example, we will have [math]\alpha_i dx^i[/math] as a co-vector, and [math]\beta^j \partial_j[/math] as a vector, and [math]\alpha_i dx^i(\beta^j \partial_j) = \alpha_i \beta^j \delta^i_j = c \in \mathbb{R}[/math]. This sum is called "contraction". There are some deep implications here, which we can talk about if you want.
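If it helps, the contraction can be spelled out numerically (toy components entirely of my own choosing):

```python
# alpha_i dx^i applied to beta^j d_j gives alpha_i beta^i, because
# dx^i(d_j) = delta^i_j collapses the double sum to a single one.
alpha = [1.0, -2.0, 0.5]   # co-vector components alpha_i
beta  = [4.0,  1.0, 2.0]   # vector components beta^j

c = sum(a * b for a, b in zip(alpha, beta))
assert c == 4.0 - 2.0 + 1.0   # i.e. c = 3.0, a plain real number
```

Note that the result carries no free indices at all: a co-vector has eaten a vector and left only a number.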

 

So what follows is merely a way that helps me to see what's going on with vector spaces and their duals.

 

Notice first that, if [math]f: R^3 \to R[/math] then, in longhand, one will have that [math]df = \frac{\partial f}{\partial x}dx+ \frac{\partial f}{\partial y}dy + \frac{\partial f}{\partial z}dz[/math].

 

In new money this becomes [math] df = \partial_i(f) dx^i[/math]. This is just a way of writing the gradient. It IS the gradient in Euclidean space with Cartesian coordinates and the usual metric, but care is needed for other coordinate systems and/or other spaces, where one would rather call it an isomorphism.

 

I could ramble at length on the subject of isomorphism vs. equality, so don't get me started!

 

Now take [math]dx^i[/math]. Then since we had the [math]x^i[/math] as 0-forms, we must have [math]dx^i[/math] as a 1-form, so let's regard this as a sort of gradient.

 

Now when we look at, say, a walker's map, we find the gradient of a hill is given as contour lines. So, by plotting a certain trajectory through these contours, I can, at least in principle, compute the energy I shall thereby expend. Let's say this is some number.

 

By thinking of this "trajectory" as a vector, and our 1-form as a gradient, i.e. a set of planes through which our vector is passing, I can visualize the relation [math]\alpha_i dx^i(\beta^j \partial_j) = c[/math] as the "energy" of a vector.

 

I have seen some texts which actually use the term "energy" to describe something akin to length/angle where no metric is defined.

Posted
In new money this becomes [math] df = \partial_i(f) dx^i[/math]. This is just a way of writing the gradient.
The actual gradient is only the [math]\partial_if[/math] bit.

 

I could ramble at length on the subject of isomorphism vs. equality, so don't get me started!
Yup, the map is not the territory....:hyper:
Posted
The actual gradient is only the [math]\partial_if[/math] bit.
Actually, I don't think this is quite correct.

 

If you are using the same notational conventions as I am, you have that, say, [math]\nabla f = \frac{\partial f}{\partial x}+ \frac{\partial f}{\partial y}+ \frac{\partial f}{\partial z}[/math]

 

Is this what you mean? This is not the gradient as I understand it.

 

I think, just guessing, you mean that, in terms of components on some arbitrary basis, that [math]\nabla f =(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z})[/math]. I have no problem with this, but, it is not what my notation implied, for.....

 

.....this is NOT a sum, it is an ordered 3-tuple (of numbers) in this case. In terms of (co)vectors you MUST apply a basis vector to each element in the n-tuple, and then sum; thus, say, [math]\nabla f = \frac{\partial f}{\partial x}\vec{e}_x+ \frac{\partial f}{\partial y}\vec{e}_y+\frac{\partial f}{\partial z}\vec{e}_z[/math].

 

As we will find in any intermediate calculus text.

 

Grrr.

Posted

No, I did not mean the sum, I meant the set of components, indexed by [imath]i[/imath], which you distinguish from the actual vector. Formally you are correct in making this distinction, but still the [imath]dx^i[/imath] are not the basis vectors, so [imath]df[/imath] is not the gradient, it's the differential. The former is a vector and the latter isn't.
