Detecting manipulation (e.g., photo copy-pasting) in images

Question

I am looking for a solution to detect photos that are manipulated with tools such as Photoshop. For a start, I want to detect copy-pasted images.

Any idea how to detect photos that are manipulated by pasting another photo on top of the original photo?

For example, detecting a photo of an ID card with a photo of a face pasted in place of the original face.

To make it even more difficult, let's assume we downsample the image after pasting the face in place. This will smooth the sharp edges of the pasted image.

Update 1:

1) It seems that compression techniques as well as straightforward CNN training don't work.

2) This is a relevant post

3) This is a summary of photo forensic methods.

Update 2:

Since there was no real progress in here, I am starting a bounty.

Update 3:

Thanks to the bounty and @machine-epsilon, we have a valid answer!

Update 4:

Since this paper came out at ICCV 2019, I am adding it here.

I would make a blurred diff and look for rings. I would look for fundamentally different textures (small-scale) on either side of the ring when the color is the same. I would look for the ring to happen where it shouldn't be. If you fed that kind of pre-processed data into a deep learner, it should have better results. If you make a better problem statement, i.e. one with several sample inputs/outputs, etcetera, then a good answer might be able to be made. — EngrStudent, Jan 12 '18 at 21:07
CNNs didn't really work. The problem is, when you photoshop somebody's face with another person's, for example on an ID card, there is a color difference at the borders anyway, either in the real one or the manipulated one. So that won't work. If the manipulator downsamples after manipulation, then the blur diff won't work either. Also, I am not sure what you mean by the rings. Can you explain? — PickleRick, Jan 12 '18 at 21:21
PickleRick - give two demo pictures. Don't make me half-baked invent your problem. I can show what I mean. How do you feel about Python and openCV? — EngrStudent, Jan 12 '18 at 22:17
@EngrStudent alright, I added some examples...let's see if you can detect that... — PickleRick, Jan 12 '18 at 22:51
What a crazy sample. There is the idea of signal to noise ratio, and if the noise gets big enough it can drown out any signal. — EngrStudent, Jan 12 '18 at 22:56
Why should the SNR be any different? I don't think the SNR will give us anything here. The changes are very slight and won't differ much between real and manipulated. — PickleRick, Jan 12 '18 at 23:00
Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/71615/discussion-between-picklerick-and-engrstudent). — PickleRick, Jan 12 '18 at 23:04

MachineEpsilon · Accepted Answer · 2018-01-28T01:36:55.193

In general, it's hard to detect tampering and it's a whole field of research in digital image forensics. I'll try to summarise some of the key approaches to this problem. What you're talking about is sometimes called image forgery or image tampering. And the copy-paste operation is called image composition or image splicing.

From a practical perspective there are a number of different variants of this problem:

adding something to the image (source)
removing something from the image (source)
changing global properties of the image (source)
using one image vs. multiple images, e.g. this use of the clone tool: (source)
detecting whether an image has been tampered with vs. localising the tampering
determining the type of tampering

How you solve the problem is going to be very different depending on whether you are reviewing video surveillance footage, examining a single photo in a court case or running a photo sharing site. The problem is substantially harder if the setting is adversarial and the image manipulation may have been deliberately hidden.

Another point is that a lot of legitimate postprocessing happens in images. To take an extreme example, new digital cameras introduce bokeh and blurring effects even though these were not present in the image as originally captured. So if you are interested in detecting more general types of image manipulation beyond image splicing, it's helpful to be aware of what's happening in cameras and apps.

A digital image is acquired on a camera as follows:

scene $\rightarrow$ imaging sensor $\rightarrow$ on-camera postprocessing $\rightarrow$ storage

where

the scene is the external geometry that is being photographed
the imaging sensor is a CCD or CMOS photodetector which converts light into electrical charge
postprocessing is where the camera converts the electrical charge into a digital signal and takes several corrective steps to account for camera geometry, colour correction, etc.
storage is where the finished image is written to memory. Often it's converted into a compressed format such as JPEG and stored along with relevant metadata.

By considering the acquisition process you can see several possible points where tampering will result in inconsistencies in the image:

physical scene geometry
sensor and acquisition noise
postprocessing and compression artifacts
metadata

Metadata. An obvious thing to look at is the metadata associated with the image. Often it contains camera information, time information and possibly location information. Any of these can reveal an inconsistency. If you have the Statue of Liberty in your image but the GPS coordinates say you are at McMurdo Station in Antarctica, then the image is probably a forgery. But the metadata itself is easily altered or stripped, so this is not reliable.
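
As a rough illustration of the GPS check, here is a minimal sketch. It assumes the EXIF data has already been parsed into a plain dict with a hypothetical `"gps"` key holding `(lat, lon)` in decimal degrees, and that you know from some other source (a landmark recogniser, a human reviewer) where the depicted scene actually is. The 50 km tolerance is an arbitrary choice, not a recommended value.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def gps_inconsistent(metadata, scene_latlon, max_km=50.0):
    """True if the recorded GPS position is far from where the scene is known to be.

    `metadata` is a hypothetical, already-parsed dict; returns None when there
    is no GPS tag, since stripped metadata proves nothing either way.
    """
    gps = metadata.get("gps")
    if gps is None:
        return None
    return haversine_km(gps[0], gps[1], scene_latlon[0], scene_latlon[1]) > max_km

# Approximate coordinates, for illustration only.
STATUE_OF_LIBERTY = (40.6892, -74.0445)
MCMURDO_STATION = (-77.846, 166.668)

flagged = gps_inconsistent({"gps": MCMURDO_STATION}, STATUE_OF_LIBERTY)
```

A `None` result is deliberately kept separate from `False`: missing metadata is common for legitimate images too, so it should not count as evidence of tampering by itself.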

Sensor noise. Sensor noise can be quite distinctive for a digital camera, so much so that it can be used to fingerprint the sensors in different camera models. There are several distinct types of noise introduced by sensors in digital cameras, but a very useful kind is photo-response nonuniformity (PRNU). This is a fingerprint associated with sensor noise and postprocessing, and it is robust to several image processing transformations, including lossy operations such as compression and downsampling. You can calculate the PRNU across blocks in the image, and introducing a new element from a different camera will introduce an inconsistency in the image. This seems to work pretty well, but it works best if you know the camera type. It's still possible to estimate PRNU from a single image. Color filter array interpolation should also be consistent across the image, and will be disrupted by splicing.
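
To make the block-wise PRNU idea concrete, here is a toy sketch on synthetic data. It assumes you already have a reference fingerprint for the camera (here called `k_host`), models PRNU as a multiplicative term, and uses a 3x3 mean filter as a crude stand-in for the wavelet denoisers used in practice. All names, the block size and the 0.3 threshold are illustrative assumptions.

```python
import numpy as np

def box_blur(img, k=3):
    """Mean filter used as a very crude denoiser (an assumption of this sketch)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def block_prnu_correlation(img, fingerprint, block=32):
    """Normalised correlation between noise residual and expected PRNU, per block."""
    img = img.astype(np.float64)
    residual = img - box_blur(img)   # noise residual W = I - denoise(I)
    expected = img * fingerprint     # multiplicative PRNU term I * K
    rows, cols = img.shape[0] // block, img.shape[1] // block
    corr = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * block, (r + 1) * block), slice(c * block, (c + 1) * block))
            a = residual[sl] - residual[sl].mean()
            b = expected[sl] - expected[sl].mean()
            corr[r, c] = (a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum())
    return corr

# Synthetic demo: two "cameras" photograph the same smooth scene.
rng = np.random.default_rng(0)
h = w = 128
scene = 120.0 + 0.2 * np.arange(w)[None, :] * np.ones((h, 1))
k_host = 0.02 * rng.standard_normal((h, w))    # fingerprint of the host camera
k_donor = 0.02 * rng.standard_normal((h, w))   # fingerprint of a different camera
host = scene * (1 + k_host) + rng.standard_normal((h, w))
donor = scene * (1 + k_donor) + rng.standard_normal((h, w))

forged = host.copy()
forged[32:64, 32:64] = donor[32:64, 32:64]     # splice one block from the donor

corr_map = block_prnu_correlation(forged, k_host)
suspect_blocks = np.argwhere(corr_map < 0.3)   # blocks that do not match the host fingerprint
```

In this toy setup the fingerprint is far stronger than in real photographs, so the separation between genuine and spliced blocks is much cleaner than you should expect in practice.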

Compression and processing artifacts. All image processing techniques will leave a trace on the image statistics. Digital images are very commonly compressed via JPEG, which quantizes discrete cosine transform (DCT) coefficients computed over blocks of the image. This process leaves traces in the image statistics. One interesting technique is to detect JPEG ghosts, that is, parts of an image which have been compressed twice via DCT. As you mention, I believe that downsampling will remove some of these artifacts, although the downsampling itself will be detectable.
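
The ghost effect can be reproduced with a toy model. The sketch below is not a real JPEG codec: it assumes a grayscale image, a single uniform quantization step for all DCT coefficients instead of a quantization table, block-aligned splicing, and a forged image that is stored losslessly after the paste. Real ghost detection sweeps over many quality levels and looks for a region whose error dips at a different quality than the rest of the image.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis, the transform JPEG applies to 8x8 blocks."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

D = dct_matrix()

def toy_compress(img, step):
    """Block DCT, uniform quantization with one step size, inverse DCT, round to 8 bit."""
    h, w = img.shape
    out = np.empty((h, w), dtype=np.float64)
    for y in range(0, h, 8):
        for x in range(0, w, 8):
            coeffs = D @ (img[y:y + 8, x:x + 8] - 128.0) @ D.T
            coeffs = np.round(coeffs / step) * step
            out[y:y + 8, x:x + 8] = D.T @ coeffs @ D + 128.0
    return np.clip(np.round(out), 0, 255)

def ghost_map(img, step, block=8):
    """Mean squared difference per block between img and its recompressed copy."""
    diff = (img - toy_compress(img, step)) ** 2
    h, w = img.shape
    return diff.reshape(h // block, block, w // block, block).mean(axis=(1, 3))

# Synthetic demo: the donor was compressed with step 16 before being pasted in.
rng = np.random.default_rng(1)
host = np.clip(np.round(128 + 20 * rng.standard_normal((64, 64))), 0, 255)
donor_raw = np.clip(np.round(128 + 20 * rng.standard_normal((64, 64))), 0, 255)
donor = toy_compress(donor_raw, step=16)

forged = host.copy()
forged[16:32, 16:32] = donor[16:32, 16:32]

# Recompressing at the donor's step barely changes the pasted region,
# while never-compressed regions change a lot: the "ghost".
ghost = ghost_map(forged, step=16)
```

Downsampling after the paste, as in the question, breaks the 8x8 block alignment this relies on, which is one reason the approach may fail in that setting.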

Scene consistency. An image acquired from a single source should have consistent perspective (vanishing points) and illumination. Moreover, it's hard to fake these with a composite image. I recommend looking through (Redi et al., 2011) for more details here.
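
As a small illustration of the vanishing-point check, here is a sketch using homogeneous coordinates. It assumes you have already extracted line segments that should be parallel in the real scene (for example the edges of a card or a building); the segment endpoints below are made up, and the lines are assumed not to be parallel in the image.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def intersection(l1, l2):
    """Intersection of two homogeneous lines; assumes they are not parallel in the image."""
    x = np.cross(l1, l2)
    return x[:2] / x[2]

def point_line_distance(pt, line):
    """Perpendicular distance in pixels from a point to a homogeneous line."""
    a, b, c = line
    return abs(a * pt[0] + b * pt[1] + c) / np.hypot(a, b)

# Two edges that are parallel in the scene define the vanishing point.
edge_a = line_through((0, 300), (200, 200))
edge_b = line_through((0, 500), (200, 300))
vanishing_point = intersection(edge_a, edge_b)

# A third genuine edge should pass close to the same point ...
genuine_edge = line_through((0, 100), (200, 100))
# ... while an edge from a pasted object with a different perspective may not.
pasted_edge = line_through((0, 400), (200, 380))

genuine_miss = point_line_distance(vanishing_point, genuine_edge)
pasted_miss = point_line_distance(vanishing_point, pasted_edge)
```

With noisy line detections you would need a tolerance on the miss distance, and a flat document photographed head-on gives very little perspective to work with.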

Finally, if you say "Okay, I give up. There are too many possible methods, I just want a detector", you can look at this recent ICCV paper, where they train a detector to find where an image has been manipulated. This may give you some more insight into training a black-box model.

Bappy, Jawadul H., et al. "Exploiting Spatial Structure for Localizing Manipulated Image Regions." Proceedings of the IEEE International Conference on Computer Vision. 2017.

Datasets/Contests:

CASIA V1.0 and V2.0 (image splicing) http://forensics.idealtest.org/

COVERAGE (copy-move manipulations) https://github.com/wenbihan/coverage

Media Forensics Challenge 2018 (various manipulations, requires registration) https://www.nist.gov/itl/iad/mig/media-forensics-challenge-2018

IEEE IFS-TC Image Forensics Challenge Dataset. (website currently unavailable)

RAISE (raw, unprocessed images along with camera metadata) http://mmlab.science.unitn.it/RAISE/index.php

Surveys:

Redi, Judith A., Wiem Taktak, and Jean-Luc Dugelay. "Digital image forensics: a booklet for beginners." Multimedia Tools and Applications 51.1 (2011): 133-162. https://pdfs.semanticscholar.org/8e85/c7ad6cd0986225e63dc1b4264b3e084b3f9b.pdf

Fridrich, Jessica. "Digital image forensics." IEEE Signal Processing Magazine 26.2 (2009). http://ws.binghamton.edu/fridrich/Research/full_paper_02.pdf

Farid, Hany. Digital Image Forensics: lecture notes, exercises, and matlab code for a survey course in digital image and video forensics. http://www.cs.dartmouth.edu/farid/downloads/tutorials/digitalimageforensics.pdf

Kirchner, Matthias. Notes on digital image forensics and counter-forensics. Diss. Dartmouth College, 2012. http://ws.binghamton.edu/kirchner/papers/image_forensics_and_counter_forensics.pdf

Memon, Nasir. "Photo Forensics–There Is More to a Picture than Meets the Eye." International Workshop on Digital Watermarking. Springer, Berlin, Heidelberg, 2011.

Mahdian, Babak, and Stanislav Saic. "A bibliography on blind methods for identifying image forgery." Signal Processing: Image Communication 25.6 (2010): 389-399.

Image Tampering Detection and Localization (includes recent deep learning references) https://github.com/yannadani/image_tampering_detection_references

Thank you for such a complete answer. I think the only feasible thing is to look into the ICCV paper. Compression didn't really turn out to work, and I'm not sure if scene consistency would work if what you have in the manipulated area is a photo on an ID card that is completely replaced. Another thing I thought about is to investigate very closely the area of the square around the photo and look for unexpectedly sharp edges and sudden changes. Do you have any comment on this? — PickleRick, Jan 18 '18 at 06:13
Yes, I think looking at colour gradients and edges is supposed to work pretty well as long as the compression quality of the image is high; otherwise you run into problems with compression artefacts. There are a ton of references to different techniques along those lines in that paper by Mahdian and Saic. If it's a digital photograph of a document then I think using the geometry or physics of the scene will be hard (but you may be able to use illumination, for instance). I will try to find links to the datasets mentioned in the ICCV paper as they may be useful. — MachineEpsilon, Jan 18 '18 at 06:40
Sorry, I can't find direct links to the data mentioned in the ICCV paper. If you are a researcher maybe try reaching out to the contest organisers. — MachineEpsilon, Jan 18 '18 at 07:03
I'll surely do! I was also wondering why the link is broken. Thanks for all the help! — PickleRick, Jan 18 '18 at 07:04
