I calculated that Rarity (and by extension, Twilight) has a Strength of about 44, so it follows that Big Mac is proportionally stronger, as an active outdoor human would be compared to a seamstress or librarian...
So, who wants to help make a D&D fact sheet for MLP Ponies?

If the facts are to be believed, they'd probably have a base strength score of around 46, since Twilight and Rarity are a bit weaker than normal earth ponies. Damage Reduction?
These two intrigued me, and I'm kinda bored, so have a bit of counter-calculation :P

Okay, some rough calcs, based on Tom.

Bobcat calculated ponies as being about 28" high at the shoulder (which I think is about right, compared to the size of the other animals and stuff like apples).

Looking at Rarity and Tom

Tom is about twice her height. So, let's assume that Tom is a 56" sphere (that's a 28"/0.71m radius). That gives us a rough volume of near as dammit 1.5m³ (V = (4/3)πr³).
One minor point here (I'll leave size aside, despite not really agreeing with it, as it will change little in the calculation) - it's not spherical at all; in fact, it looks like barely 2/3 of a sphere, so I'll use 1m³ (based on your estimate) to give us a nice, conservative and easy-to-calculate figure.
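For anyone who wants to check the volume numbers, here's a quick Python sketch. It assumes the 28" radius from the quoted post and my eyeballed "about 2/3 of a sphere" fraction; the names (`sphere_volume`, `TOM_FRACTION`) are just ones I made up for this.

```python
import math

INCH_TO_M = 0.0254          # exact inch-to-metre conversion
TOM_FRACTION = 2.0 / 3.0    # assumption: Tom fills roughly 2/3 of a sphere

def sphere_volume(radius_m):
    """Volume of a sphere in cubic metres, V = (4/3) * pi * r^3."""
    return 4.0 / 3.0 * math.pi * radius_m ** 3

radius_m = 28 * INCH_TO_M                    # 28" radius, about 0.71 m
full_volume = sphere_volume(radius_m)        # the quoted "near as dammit 1.5 m³"
tom_volume = full_volume * TOM_FRACTION      # my conservative "about 1 m³"

print(f"full sphere: {full_volume:.2f} m^3")
print(f"2/3 sphere:  {tom_volume:.2f} m^3")
```

So both the 1.5m³ and the 1m³ figures hold up as round numbers.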

Taking some sample rock densities, because I'm a necromancer, not a geologist (and thus am not in a position to guess what rock it might have been!):

Granite: 2691 kg/m³
Basalt: 3011 kg/m³
Mica: 2883 kg/m³
Limestone: 2611 kg/m³

So they're mostly in the 2,600-3,000 kg/m³ range.
Again, to be conservative (as the scientific method dictates, we're trying to establish a lower limit) - what if it wasn't a heavy, very dense rock like the above, but something like coal? It would even fit Discord's warped sense of humor - coal and diamond are the same thing, just differently ordered. Plus, I'd assume Rarity knows how much gemstones should weigh. So, based on that, here are some coal density figures:

Typical Bulk Density of Coal

  • Anthracite Coal : 50 - 58 (lb/ft³), 800 - 929 (kg/m³)
  • Bituminous Coal : 42 - 57 (lb/ft³), 673 - 913 (kg/m³)
  • Lignite Coal : 40 - 54 (lb/ft³), 641 - 865 (kg/m³)

So, this means that RARITY and TWILIGHT are capable of lifting and carrying between 3.9 and 4.5 TONS.
Based on the above, the load would actually be around 640-930 kg; averaged, that comes to about 3/4 of a ton. Still an impressive load, but nowhere near 4 tons.
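Here's the same mass arithmetic as a small Python sketch, so you can plug in your own volume or material. The densities are the ones listed in this thread; the function name and the dictionary layout are my own invention, and I'm treating "tons" as metric.

```python
# Densities in kg/m^3, copied from the figures quoted in this thread.
ROCK_DENSITY = {"granite": 2691, "basalt": 3011, "mica": 2883, "limestone": 2611}
COAL_DENSITY = {                      # (low, high) bulk density ranges
    "anthracite": (800, 929),
    "bituminous": (673, 913),
    "lignite": (641, 865),
}

def mass_range_kg(volume_m3, densities):
    """Return (lightest, heaviest) mass in kg for a volume and a list of densities."""
    return volume_m3 * min(densities), volume_m3 * max(densities)

# Original estimate: a full 1.5 m^3 sphere of ordinary rock.
rock_lo, rock_hi = mass_range_kg(1.5, ROCK_DENSITY.values())

# Conservative estimate: about 1 m^3 of coal.
coal_values = [d for pair in COAL_DENSITY.values() for d in pair]
coal_lo, coal_hi = mass_range_kg(1.0, coal_values)
coal_mid = (coal_lo + coal_hi) / 2

print(f"rock: {rock_lo / 1000:.1f} to {rock_hi / 1000:.1f} t")
print(f"coal: {coal_lo:.0f} to {coal_hi:.0f} kg (midpoint {coal_mid:.0f} kg)")
```

The rock case gives back the 3.9 to 4.5 ton figure, and the coal case gives the 640-930 kg range with a midpoint a little over 3/4 of a ton.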

That is about TEN TIMES the peak of human lifting capacity (and that's lift AND CARRY, not just clean-and-jerk and whatnot). And bear in mind these two are arguably the physically weakest of the Mane Cast (since their jobs don't involve a lot of physical exertion and neither is depicted as being very physically active.)
Counterpoint - what if unicorns can use internal telekinesis to supplement their muscles, much like Jedi Knights? Twilight has easily pulled off feats far beyond any pony's physical ability, so I wouldn't assume she is necessarily the weakest if she can harness even a fraction of that to power her muscles. Remember, she placed near the top in the race despite zero formal preparation and little to no physical activity, against a whole town of rural ponies.

Back working to D&D through carrying capacity, assuming it is a heavy load (as small quadrupeds, carrying capacity is equal to medium bipeds), Rarity and Twilight have strength scores of about 44. The same as big DRAGONS.
I disagree about them being small, seeing as many creatures smaller than humans are in fact Medium, but as you said, let's use small quadruped to get a medium biped's carrying capacity. Sadly, you forget one thing: it might not be a straight carried load. What if Rarity carrying Tom was doing this:

A character can lift as much as double his or her maximum load off the ground, but he or she can only stagger around with it. While overloaded in this way, the character loses any Dexterity bonus to AC and can move only 5 feet per round (as a full-round action).

It fits well, and if it's true, then her Str score would only need to be 25, allowing for 1600 pounds of weight carried that way. Impressive, but nowhere near Str 44. And:

A character can generally push or drag along the ground as much as five times his or her maximum load. Favorable conditions can double these numbers, and bad circumstances can reduce them to one-half or less.

If what Rarity did was just an embellished version of the above, the necessary Str drops to 20, an impressive but far from godlike value.

And, in the interest of conservative values - what if Rarity is a medium quadruped, if only for the purposes of carrying capacity? Then the scores drop to 22 and 14-15, the latter figure being almost completely in the mundane range for someone living in a rural village.
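If you want to play with the Strength numbers yourself, here's a rough Python sketch of the lookup. Assumptions, loudly: the heavy-load column is typed in from the 3.5 carrying capacity table as I remember it, scores above 29 use the "x4 per +10 Strength" rule, a small quadruped uses x1 and a medium quadruped x1.5, and `min_strength` is just a helper name I made up. Where you land depends a lot on which weight in the 640-930 kg range you feed it, so expect results a point or two either side of the figures in this post.

```python
LB_PER_KG = 2.20462  # pounds per kilogram

# Heavy-load upper limits in lb for Strength 1 to 29 (medium biped).
# Assumption: typed in from the 3.5 carrying capacity table from memory.
HEAVY_LOAD_LB = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100,
                 115, 130, 150, 175, 200, 230, 260, 300, 350, 400,
                 460, 520, 600, 700, 800, 920, 1040, 1200, 1400]

def heavy_load_lb(strength):
    """Maximum heavy load in lb; above Str 29, every +10 Strength multiplies by 4."""
    if strength < 1:
        raise ValueError("Strength must be at least 1")
    factor = 1
    while strength > 29:
        strength -= 10
        factor *= 4
    return HEAVY_LOAD_LB[strength - 1] * factor

def min_strength(load_lb, size_mult=1.0, handling_mult=1.0):
    """Lowest Strength whose adjusted maximum load covers load_lb.

    size_mult: 1.0 for a small quadruped, 1.5 for a medium quadruped.
    handling_mult: 1.0 carry, 2.0 lift-and-stagger, 5.0 push or drag.
    """
    strength = 1
    while heavy_load_lb(strength) * size_mult * handling_mult < load_lb:
        strength += 1
    return strength

# Try both ends of the coal estimate under each reading of the scene.
for size_name, size_mult in (("small quadruped", 1.0), ("medium quadruped", 1.5)):
    for how, handling_mult in (("carry", 1.0), ("stagger", 2.0), ("drag", 5.0)):
        scores = [min_strength(kg * LB_PER_KG, size_mult, handling_mult)
                  for kg in (641, 929)]
        print(f"{size_name:16} {how:8} Str {scores[0]} to {scores[1]}")
```

Calling `min_strength(4500 * LB_PER_KG)` gives back the 44 from the original rock estimate, which is a decent sanity check that the x4 extension is wired up the way the quoted post used it.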