Timeline or Code for Cut-Scenes?

I’m using AS3 OOP.

I’m making a game that has cut scenes. Here’s an example of one:

There’s a slow pan down from the sky to an alley where there’s a guy talking on a phone. There’s a dialogue “bubble” showing what he’s saying. He changes poses a few times while his mouth is animated (like Phoenix Wright for Nintendo DS, or probably most anime). Then it cuts to the person he’s talking to and there’s another dialogue “bubble” showing this person’s speech, as well as mouth animation, etc. Then it cuts back to the first person - mouth animated and speech “bubble”, and the camera pans down the street as it fades to black.

These particular cut scenes may have taken inspiration from the Sly Cooper cut scenes - sort of 2d in a 3d space, and not a lot of complex character animation going on.

Given all these long, slow camera pans, plus a mouth animation MovieClip that loops, would you use the timeline for the cut scenes or script them? The cut scenes may work similarly to Grand Theft Auto’s, where you have a few characters who are in roughly the same location each time you talk to them (though not graphically comparable; mine is simple 2D). To do this on the timeline, I guess I’d have one MovieClip per cut scene and animate each one separately. But that means motion tweens spanning something like 190 frames, which I don’t really like (as a developer). I’d also probably end up with frame scripts for things like stopping, looping through a few frames a couple of times, setting timers, and using scripted tweens in places to avoid those 190-frame tweens, and the whole thing would turn into a hodge-podge, mish-mash mess.

…or do I write long-winded, cumbersome classes for each one: adding each child asset (the person, the background, the speech bubbles, any other “environment” asset), positioning them, and setting up a long series of tweens whose listeners kick off the next tweens, on and on in one big chain reaction? I could even make all the speech bubbles dynamic text. If I wanted to get really “code-snobby” I could even write a little function to randomize the mouth movements. But is it worth it? It seems MUCH easier to do it on the timeline, but, ugh, I hate those 190-frame tweens…
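
To make the scripted option concrete, here is a rough sketch of the chained-tween idea. It assumes the fl.transitions.Tween class that ships with the Flash authoring tool. The class name AlleyCutScene, the scene and mouth clips, and every coordinate and duration are made-up placeholders, and the fade assumes a black stage behind the scene.

```actionscript
package {
    import flash.display.MovieClip;
    import flash.display.Sprite;
    import flash.events.TimerEvent;
    import flash.utils.Timer;
    import fl.transitions.Tween;
    import fl.transitions.TweenEvent;
    import fl.transitions.easing.None;
    import fl.transitions.easing.Regular;

    // Hypothetical cut-scene class: pan down, talk for a while, fade out.
    public class AlleyCutScene extends Sprite {
        private var scene:Sprite;      // container with background and characters
        private var mouth:MovieClip;   // mouth clip, one mouth shape per frame
        private var mouthTimer:Timer;  // picks a new mouth frame at intervals
        private var lineTimer:Timer;   // how long the line of dialogue lasts
        private var tween:Tween;       // kept as a field so it is not garbage collected mid-run

        public function AlleyCutScene(scene:Sprite, mouth:MovieClip) {
            this.scene = scene;
            this.mouth = mouth;
            addChild(scene);
            mouth.stop();

            mouthTimer = new Timer(100);    // assumed: new mouth shape every 100 ms
            mouthTimer.addEventListener(TimerEvent.TIMER, onMouthTick);

            lineTimer = new Timer(3000, 1); // assumed: the line lasts 3 seconds, fires once
            lineTimer.addEventListener(TimerEvent.TIMER_COMPLETE, onLineDone);
        }

        public function play():void {
            // Step 1: pan down from the sky by sliding the scene up over 8 seconds.
            tween = new Tween(scene, "y", Regular.easeInOut, 0, -400, 8, true);
            tween.addEventListener(TweenEvent.MOTION_FINISH, onPanDone);
        }

        private function onPanDone(e:TweenEvent):void {
            // Step 2: the pan is finished, so start the dialogue.
            tween.removeEventListener(TweenEvent.MOTION_FINISH, onPanDone);
            mouthTimer.start();
            lineTimer.start();
        }

        private function onMouthTick(e:TimerEvent):void {
            // Randomized mouth movement: jump to a random frame (frames are 1-based).
            mouth.gotoAndStop(int(Math.random() * mouth.totalFrames) + 1);
        }

        private function onLineDone(e:TimerEvent):void {
            // Step 3: stop talking and fade the scene out over 2 seconds.
            mouthTimer.stop();
            mouth.gotoAndStop(1);
            tween = new Tween(scene, "alpha", None.easeNone, 1, 0, 2, true);
        }
    }
}
```

Using it would just be a matter of creating an instance, adding it to the display list, and calling play(). Even with only three steps you can see the chain reaction forming: every new beat in the scene means another listener and another handler.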

Is there a method somewhere in between these two? How would you (or do you) do it? Either approach on its own, or some messy combination of the two, is what I’d normally do, I guess, but now that I’m trying to write more structured, object-oriented code, I’d like things to be a little cleaner.

Also, do scripted tweens have any serious advantages over timeline tweens with regard to .swf file size (or anything else)?

Any thoughts?