It's important to understand that there are two aspects to thread safety:
(1) Execution control, and
(2) Memory visibility.
First, we need to focus on controlling when code executes and whether it can execute concurrently. Second, we need to ensure that changes to shared variables are visible across threads. Because each CPU has several levels of cache between it and main memory, threads running on different CPUs or cores can see "memory" differently at any given moment, since threads are permitted to obtain and work on private copies of main memory.
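A minimal sketch of the visibility hazard (class and field names are illustrative). Without `synchronized` or `volatile`, the Java Memory Model gives no guarantee that the reader thread ever observes the writer's update, so the demo bounds the spin and does not assert a particular outcome:

```java
public class VisibilityDemo {
    static boolean flag = false; // plain field: no visibility guarantee

    // Returns true if the reader happened to observe the write within the
    // bounded spin; this outcome is NOT guaranteed by the memory model.
    static boolean run() throws InterruptedException {
        final boolean[] seen = {false};
        Thread reader = new Thread(() -> {
            for (long i = 0; i < 1_000_000_000L; i++) {
                if (flag) { seen[0] = true; break; }
            }
        });
        reader.setDaemon(true); // never block JVM exit if the write stays unseen
        reader.start();
        flag = true;            // plain write: may linger in a CPU cache
        reader.join(2000);      // bounded wait so the demo always terminates
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reader saw the write: " + run());
    }
}
```

In practice the write is usually seen quickly, but "usually" is exactly what thread safety is about eliminating.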
Using synchronized prevents any other thread from obtaining the monitor (or lock) for the same object, thereby preventing any and all code protected by synchronization on the same object from ever executing concurrently.
Importantly, synchronization also creates a "happens-before" memory barrier, imposing a memory-visibility constraint: everything a thread does before releasing a lock is visible to any other thread that subsequently acquires the same lock, as if it had happened before that acquisition.
On current hardware, this typically causes flushing of the CPU caches when a monitor is acquired and writes to main memory when it is released, both of which are expensive.
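A sketch of both guarantees in one place (names are illustrative): `synchronized` serializes the read-modify-write, and acquiring the same lock in `get()` guarantees the reader sees writes made before the lock's last release, so the count is exact:

```java
public class SyncCounter {
    private long count = 0;
    private final Object lock = new Object();

    void increment() {
        synchronized (lock) { // only one thread at a time may hold `lock`
            count++;          // read-modify-write cannot interleave
        }
    }

    long get() {
        synchronized (lock) { // acquiring the same lock also guarantees we
            return count;     // see writes made before its last release
        }
    }

    // Two threads, 100_000 increments each; synchronized makes the total exact.
    static long run() throws InterruptedException {
        SyncCounter c = new SyncCounter();
        Runnable task = () -> { for (int i = 0; i < 100_000; i++) c.increment(); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        return c.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // always 200000
    }
}
```

Remove the `synchronized` blocks and the result becomes some number at or below 200000, varying from run to run, because increments get lost and stale values get read.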
Using volatile, on the other hand, forces all accesses (read or write) to the volatile variable to occur to main memory, effectively keeping the volatile variable out of CPU caches.
This can be useful where all that is required is correct visibility of the variable itself and the ordering of other accesses is not important. (Note that since Java 5, a volatile write also establishes a happens-before relationship with subsequent reads of that variable, so writes made before the volatile write become visible as well.)
Using volatile also changes the treatment of long and double to require accesses to them to be atomic; on some (older) hardware this might require locks, though not on modern 64-bit hardware.
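The classic use of volatile is a stop flag, where only visibility matters (a sketch; names are illustrative). Because `running` is volatile, the worker's read cannot be cached or hoisted out of the loop, so the loop terminates promptly once the flag is cleared:

```java
public class VolatileFlag {
    // volatile: every read/write goes to main memory, so the worker
    // is guaranteed to observe the updated value
    private volatile boolean running = true;

    static boolean run() throws InterruptedException {
        VolatileFlag f = new VolatileFlag();
        Thread worker = new Thread(() -> {
            while (f.running) {
                // spin; the volatile read each iteration cannot be
                // optimized away or hoisted out of the loop
            }
        });
        worker.start();
        Thread.sleep(50);   // let the worker spin for a moment
        f.running = false;  // volatile write: visible to the worker
        worker.join(5000);  // with volatile, this returns promptly
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker stopped: " + run());
    }
}
```

If `running` were a plain field, the JIT would be free to read it once and loop forever on the cached value; volatile rules that out.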