SEL, a State-based Language for Video Surveillance Modeling, is a formal language designed to represent and identify activities in surveillance systems through scenario semantics and motion primitives structured into programs. Motion primitives are the most basic motion structures detected as motion evidence, and they represent the temporal evolution of that evidence. Primitives are composed with operators such as sequence, parallel, and concurrency, which indicate trajectory evolution, simultaneity, and synchronization. SEL is an expressive language that characterizes interactions by describing the relationships between motion primitives; these interactions determine the scenario’s activity and meaning. To demonstrate the value of SEL, an experimental model is constructed that incorporates challenging activities in surveillance systems, which makes it possible to assess the language’s suitability for describing complex tasks.
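As an illustration only, the idea of a program built from motion primitives and the three operators can be sketched as a small expression tree. This is not SEL's actual syntax or implementation: the class names (`Primitive`, `Seq`, `Par`, `Conc`), the infix symbols used for printing, and the primitive names in the example are all hypothetical, and the pairing of sequence with trajectory evolution, parallel with simultaneity, and concurrency with synchronization is one reading of the description above.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Primitive:
    """A basic motion structure detected as motion evidence (hypothetical type)."""
    name: str


@dataclass(frozen=True)
class Seq:
    """Sequence: parts follow one another (assumed: trajectory evolution)."""
    parts: Tuple["Expr", ...]


@dataclass(frozen=True)
class Par:
    """Parallel: parts occur at the same time (assumed: simultaneity)."""
    parts: Tuple["Expr", ...]


@dataclass(frozen=True)
class Conc:
    """Concurrency: parts are synchronized (assumed: synchronization)."""
    parts: Tuple["Expr", ...]


Expr = Union[Primitive, Seq, Par, Conc]

# Hypothetical infix symbols, used only for printing; not SEL notation.
SYMBOLS = {Seq: " ; ", Par: " || ", Conc: " <> "}


def primitives(expr: Expr) -> list:
    """Collect primitive names in left-to-right order."""
    if isinstance(expr, Primitive):
        return [expr.name]
    names = []
    for part in expr.parts:
        names.extend(primitives(part))
    return names


def render(expr: Expr) -> str:
    """Render the expression tree as a parenthesized string."""
    if isinstance(expr, Primitive):
        return expr.name
    return "(" + SYMBOLS[type(expr)].join(render(p) for p in expr.parts) + ")"


# Hypothetical scenario: two people approach at the same time,
# then come to a synchronized stop.
scenario = Seq((
    Par((Primitive("approach_A"), Primitive("approach_B"))),
    Conc((Primitive("stop_A"), Primitive("stop_B"))),
))

print(render(scenario))
print(primitives(scenario))
```

Keeping the operators as separate node types means an interaction is described purely by how primitives are related in the tree, which mirrors the claim that relationships between primitives determine the scenario's meaning.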