How to precisely jump to a specific time in AVPlayer


What’s the issue?

When using the AVPlayer method seekToTime:, I ran into an issue that I didn’t understand: the player does not land on the requested time. In other words, the position you get back is not exactly the one you asked for.

To me, that’s a serious problem. In a multimedia app that shows subtitles, it leaves the subtitles out of sync with the spoken audio, and the whole app feels a bit weird to the user at that moment.

[Image: WTF.jpg] This is probably what users would think!

Do I have a solution for this?

I tried several ways to reach the goal of moving the time cursor precisely to the correct position. Unfortunately, I couldn’t find a method that achieved it. For example:

[self.player seekToTime:CMTimeMakeWithSeconds(musicTime, 600)
        toleranceBefore:kCMTimeZero
         toleranceAfter:kCMTimeZero];

Looking back at this snippet, I thought I had run into a dead end. But it raises a question: why is 600 the correct time scale to use? Why this magic number?

To answer these questions, I think I should know and explain what a CMTime means.

What’s CMTime?

Let’s see the definition from Apple - CMTime

A struct representing a time value such as a timestamp or duration. Defines a structure that represents a rational time value int64/int32.

So, it represents a rational number with a numerator (an int64 value) and a denominator (an int32 value). Developers use it to specify positions and durations in video and audio applications. The CMTime structure is defined as:

typedef struct
{
    CMTimeValue value;      /* numerator: number of units */
    CMTimeScale timescale;  /* denominator: units per second */
    CMTimeFlags flags;
    CMTimeEpoch epoch;
} CMTime;

How to create a CMTime?

The two most commonly used functions for creating a CMTime in Objective-C are:

  • CMTime CMTimeMakeWithSeconds(Float64 seconds, int32_t preferredTimescale);
  • CMTime CMTimeMake(int64_t value, int32_t timescale);

A larger time scale gives higher precision, because each unit covers a smaller fraction of a second and additions and subtractions stay exact in integer units. For video, Apple’s documentation points to 600 as a commonly used time scale. From the document Time and Media Representations:

You frequently use a timescale of 600, because this is a multiple of several commonly used frame rates: 24 fps for film, 30 fps for NTSC (used for TV in North America and Japan), and 25 fps for PAL (used for TV in Europe).

So, 600 is a common multiple of 24, 25, and 30 fps. In fact, it is their least common multiple.
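To make the "common multiple" point concrete, here is a minimal sketch in plain C (which also compiles as Objective-C). It does not call Core Media; ticks_per_frame is a hypothetical helper written only for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper, not part of Core Media.
 * Returns how many timescale units ("ticks") one frame occupies,
 * or -1 when a frame does not fit a whole number of ticks. */
int32_t ticks_per_frame(int32_t timescale, int32_t fps)
{
    if (timescale <= 0 || fps <= 0 || timescale % fps != 0) {
        return -1;
    }
    return timescale / fps;
}
```

With a time scale of 600, one frame is 25 ticks at 24 fps, 24 ticks at 25 fps, and 20 ticks at 30 fps, so every frame boundary falls on a whole tick. With a time scale of 100, a 24 fps frame does not fit a whole number of ticks and the helper returns -1.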

Is the time scale equal to the frame rate?

I know the description above makes it sound as if the time scale is the frame rate, so that we would process the video at 600 frames per second. That looks weird, because video clips do not normally run at 600 fps. So what exactly does the time scale mean?

Back to the original question: how do we indicate a precise time in seconds? For example, if you want to find the scene at 20 minutes and 40 seconds in a video, how do you represent that? We might store 20:40 as an NSTimeInterval, which would be 1240.0, a double-precision floating-point number. That seems unproblematic.

The problem is that arithmetic on double-precision floating-point numbers is inexact. A double occupies 8 bytes (64 bits) and stores values in binary, so many fractions of a second, such as 0.1, cannot be represented exactly, and the rounding errors accumulate as you add and subtract. If we calculate time with a double, the fractional part can end up wrong. That’s why Apple provides the CMTime structure. Do you remember the structure of CMTime? Here are the definitions of its fields.

typedef int64_t CMTimeValue;
typedef int32_t CMTimeScale;
typedef enum CMTimeFlags : uint32_t {
    ...
} CMTimeFlags;

You might remember that I said value is the numerator and timescale is the denominator. By storing the time as an int64 over an int32, we avoid the precision loss of double arithmetic.
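The difference between the two representations can be seen in a small sketch. It is plain C (also valid Objective-C) and does not use Core Media; both function names are made up for this example. It steps forward in increments of 0.1 second, once as a double and once as integer units at a time scale of 10.

```c
#include <stdint.h>

/* Adds 0.1 second to a double, `steps` times. */
double accumulate_seconds(int steps)
{
    double t = 0.0;
    for (int i = 0; i < steps; i++) {
        t += 0.1;  /* 0.1 has no exact binary representation */
    }
    return t;
}

/* The same walk counted in whole units at time scale 10,
 * where one unit is exactly 1/10 of a second. */
int64_t accumulate_ticks(int steps)
{
    int64_t value = 0;
    for (int i = 0; i < steps; i++) {
        value += 1;  /* integer addition is exact */
    }
    return value;
}
```

After ten steps, the double sum is not exactly 1.0 under IEEE 754 double arithmetic, while the integer count is exactly 10 units, that is 10/10 = 1 second.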

Then what exactly does the time scale mean?

A time scale is the number of slices in one second. For example, if you use 600 as the time scale, each second is sliced into 600 slots. Conceptually, the time scale resembles a frame rate, but it is not the actual frame rate of the video clip.

Back to the function.

CMTimeMake(200, 10) means there are 200 units, and each unit occupies 1/10 of a second, so the time is 20 seconds. CMTimeMakeWithSeconds(200, 10) means the time is 200 seconds, and each unit occupies 1/10 of a second. How many units is that? The answer is 2000 units. To get the precise time for the next jump, we need to know the exact time scale of this video clip:

// Time scale of the asset currently loaded in the player
int32_t currentAssetTimeScale = self.player.currentItem.asset.duration.timescale;

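Then how many units does the target time occupy? Since musicTime is in seconds, let CMTimeMakeWithSeconds convert it using that time scale:

CMTime nextInterval = CMTimeMakeWithSeconds(musicTime, currentAssetTimeScale);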

You will get the next start position for playing the video.
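As a recap of how CMTimeMake and CMTimeMakeWithSeconds differ, here is a rough model of the value/timescale arithmetic in plain C (also valid Objective-C). RationalTime and the three functions are hypothetical stand-ins, not the Core Media types, and rounding to the nearest unit is an assumption of this sketch rather than a statement about how Core Media rounds.

```c
#include <stdint.h>

/* Stand-in for the value/timescale pair of CMTime; not the real type. */
typedef struct {
    int64_t value;      /* numerator: number of units */
    int32_t timescale;  /* denominator: units per second */
} RationalTime;

/* Same idea as CMTimeMake: the caller supplies the unit count. */
RationalTime rational_make(int64_t value, int32_t timescale)
{
    RationalTime t = { value, timescale };
    return t;
}

/* Same idea as CMTimeMakeWithSeconds: the caller supplies seconds.
 * Assumes seconds >= 0 and rounds to the nearest unit. */
RationalTime rational_make_with_seconds(double seconds, int32_t timescale)
{
    RationalTime t = { (int64_t)(seconds * timescale + 0.5), timescale };
    return t;
}

/* Converts back to seconds for display or comparison. */
double rational_seconds(RationalTime t)
{
    return (double)t.value / t.timescale;
}
```

In this model, rational_make(200, 10) is 20 seconds, while rational_make_with_seconds(200.0, 10) holds 2000 units, which is 200 seconds. Passing a number of seconds to the first function silently divides it by the time scale, which is why the seconds-based constructor is the safer choice when the input is a time in seconds.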

[self.player seekToTime:nextInterval
        toleranceBefore:kCMTimeZero
         toleranceAfter:kCMTimeZero];

You can jump to the correct start position now. Happy coding.

[Reference]

  1. Blog - D2Vision, CMTime for video and audio apps on iOS!
  2. Warren Moore — Understanding CMTime
  3. ios - Trying to understand CMTime - Stack Overflow
  4. Double-precision floating-point format - Wikipedia
  5. CMTimeFlags - Core Media | Apple Developer Documentation
  6. About CMTime