This paper proposes a method for tracking video text and extracting text strokes against complex backgrounds. A point matching method based on Harris corner features is introduced to track text objects, and its performance is evaluated by comparison with an SSD-based tracking method. A multi-frame foreground/background recognition algorithm is proposed to extract text strokes for optical character recognition. Experiments on TV serials and movies demonstrate the efficiency and robustness of both the point matching method for video text tracking and the text stroke extraction algorithm.
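
To make the tracking idea concrete, the following is a minimal sketch of corner-based point matching between two frames, not the paper's actual implementation. It assumes grayscale frames stored as lists of rows, a 3x3 structure-tensor window, the common Harris constant k = 0.04, a sum-of-squared-differences patch cost for comparing corners, and a majority vote over corner displacements. All function names and parameters here (`harris_response`, `corners`, `patch_ssd`, `estimate_shift`, `margin`, `r`) are hypothetical and chosen for illustration only.

```python
from collections import Counter


def harris_response(img, k=0.04):
    """Harris response R = det(M) - k * trace(M)^2 for every pixel."""
    h, w = len(img), len(img[0])
    # Central-difference gradients; border pixels are left at zero.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Structure tensor M summed over a 3x3 window (assumed size).
            sxx = syy = sxy = 0.0
            for v in range(y - 1, y + 2):
                for u in range(x - 1, x + 2):
                    sxx += ix[v][u] * ix[v][u]
                    syy += iy[v][u] * iy[v][u]
                    sxy += ix[v][u] * iy[v][u]
            resp[y][x] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp


def corners(img, k=0.04, margin=2):
    """Return (x, y) points that are positive local maxima of the response."""
    resp = harris_response(img, k)
    h, w = len(img), len(img[0])
    pts = []
    for y in range(margin, h - margin):
        for x in range(margin, w - margin):
            r = resp[y][x]
            if r > 0 and all(r >= resp[y + dy][x + dx]
                             for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                pts.append((x, y))
    return pts


def patch_ssd(a, pa, b, pb, r=2):
    """Sum of squared differences between two (2r+1)x(2r+1) patches."""
    (xa, ya), (xb, yb) = pa, pb
    return sum((a[ya + dy][xa + dx] - b[yb + dy][xb + dx]) ** 2
               for dy in range(-r, r + 1) for dx in range(-r, r + 1))


def estimate_shift(frame_a, frame_b):
    """Match corners of frame_a to frame_b and vote on the displacement.

    Returns (dx, dy), or None when either frame has no corners.
    """
    ca, cb = corners(frame_a), corners(frame_b)
    if not ca or not cb:
        return None
    votes = Counter()
    for p in ca:
        # Best match = corner in frame_b with the most similar patch.
        q = min(cb, key=lambda c: patch_ssd(frame_a, p, frame_b, c))
        votes[(q[0] - p[0], q[1] - p[1])] += 1
    return votes.most_common(1)[0][0]
```

Calling `estimate_shift(previous_frame, current_frame)` on two frames that contain the same text block returns the most frequently voted corner displacement, which can then be applied to the text bounding box. The majority vote is one simple way to tolerate a few wrong matches caused by a cluttered background; a real tracker would likely restrict the search to a window around the previous text position.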