In general, it's hard to detect tampering and it's a whole field of research in digital image forensics. I'll try to summarise some of the key approaches to this problem. What you're talking about is sometimes called image forgery or image tampering. And the copy-paste operation is called image composition or image splicing.
From a practical perspective there are a number of different variants of this problem:
- adding something to the image
- removing something from the image
- changing global properties of the image
- using one image vs. multiple images, e.g. the clone tool copies content from within the same image
- detecting whether an image has been tampered with vs. localising the tampering
- determining the type of tampering
How you solve the problem will be very different depending on whether you are reviewing video surveillance footage, examining a single photo for a court case, or running a photo sharing site. The problem is substantially harder if the setting is adversarial and the manipulation may have been deliberately hidden.
Another point is that a lot of legitimate postprocessing happens to images. To take an extreme example, newer digital cameras add bokeh and blurring effects even though these are not present in the image as captured by the sensor. So if you are interested in detecting more general types of image manipulation beyond image splicing, it's helpful to be aware of what's happening in cameras and apps.
A digital image is acquired on a camera as follows:
scene $\rightarrow$ imaging sensor $\rightarrow$ on camera postprocessing $\rightarrow$ storage
where
- the scene is the external geometry of the image
- the image sensor is a CCD or CMOS
photodetector which converts light into electrical charge
- postprocessing is where the electrical charge is
converted into a digital signal and several corrective steps are
taken to account for camera geometry, colour correction, etc.
- storage is where the finished image is written to memory. Often it's
converted into a compressed format such as JPEG and stored along with relevant metadata.
By considering the acquisition process you can see several possible points where tampering will result in inconsistencies in the image:
- physical scene geometry
- sensor and acquisition noise
- postprocessing and compression artifacts
- metadata
Metadata. An obvious thing to look at is the metadata associated with the image, which often includes camera, time and possibly location information. Any of these can reveal an inconsistency. If you have the Statue of Liberty in your image but the GPS coordinates say you are at McMurdo Station in Antarctica, then the image is probably a forgery. But the metadata is itself easily altered or stripped, so this is not reliable on its own.
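The Statue of Liberty example can be turned into a simple plausibility check. This is only a sketch under stated assumptions: the GPS position has already been read from the metadata as decimal degrees, the landmark has been recognised by some other means, the coordinates below are approximate, and the 50 km tolerance is an arbitrary choice. The function names are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

def gps_is_plausible(metadata_gps, landmark_gps, max_km=50.0):
    """True if the metadata position is within max_km of the recognised landmark."""
    return haversine_km(*metadata_gps, *landmark_gps) <= max_km

# Approximate coordinates (latitude, longitude), assumed for this example.
STATUE_OF_LIBERTY = (40.6892, -74.0445)
MCMURDO_STATION = (-77.85, 166.67)

distance = haversine_km(*MCMURDO_STATION, *STATUE_OF_LIBERTY)
plausible = gps_is_plausible(MCMURDO_STATION, STATUE_OF_LIBERTY)
print(f"distance: {distance:.0f} km, plausible: {plausible}")
```

A failed check only flags an inconsistency between metadata and content. It says nothing about which of the two was altered.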
Sensor noise. Sensor noise can be quite distinctive for a digital camera, so much so that it can be used to fingerprint the sensors in different camera models. There are several distinct types of noise introduced by sensors in digital cameras, but a very useful kind is photo-response nonuniformity (PRNU). This is a fingerprint associated with sensor noise and postprocessing, and it is robust to several image processing operations, including lossy compression and downsampling. You can calculate the PRNU across blocks in the image, and an element spliced in from a different camera will introduce an inconsistency. This seems to work pretty well, but it works best if you know the camera type. It's still possible to estimate PRNU from a single image. Color filter array interpolation should also be consistent across the image, and will be disrupted by splicing.
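To make the block-wise PRNU idea concrete, here is a toy sketch on synthetic data. Everything in it is an assumption made for illustration: the camera fingerprint is taken as known (in practice it would be estimated, ideally from many images from the same camera), a 3x3 mean filter stands in for the much better denoisers used in the literature, and the 0.5 threshold is arbitrary. All function and variable names are hypothetical.

```python
import random
from statistics import mean

def box_denoise(img):
    """3x3 mean filter with clamped borders (crude stand-in for a real denoiser)."""
    h, w = len(img), len(img[0])
    return [[sum(img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
             for x in range(w)] for y in range(h)]

def correlation(a, b):
    """Normalised (Pearson) correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = (sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b)) ** 0.5
    return num / den if den else 0.0

def block_prnu_scores(img, fingerprint, block=16):
    """Correlate the noise residual with fingerprint * image in each block."""
    denoised = box_denoise(img)
    h, w = len(img), len(img[0])
    scores = {}
    for by in range(0, h, block):
        for bx in range(0, w, block):
            residual, expected = [], []
            for y in range(by, min(by + block, h)):
                for x in range(bx, min(bx + block, w)):
                    residual.append(img[y][x] - denoised[y][x])
                    expected.append(fingerprint[y][x] * img[y][x])
            scores[(by, bx)] = correlation(residual, expected)
    return scores

def synth_image(fingerprint, rng, size):
    """Smooth gradient scene, multiplicative fingerprint, additive noise."""
    return [[(100.0 + x + y) * (1.0 + fingerprint[y][x]) + rng.gauss(0, 0.5)
             for x in range(size)] for y in range(size)]

rng = random.Random(0)
size = 32
cam_a = [[rng.gauss(0, 0.02) for _ in range(size)] for _ in range(size)]
cam_b = [[rng.gauss(0, 0.02) for _ in range(size)] for _ in range(size)]
image = synth_image(cam_a, rng, size)
donor = synth_image(cam_b, rng, size)

# Splice the bottom-right 16x16 block in from the other (simulated) camera.
for y in range(16, 32):
    for x in range(16, 32):
        image[y][x] = donor[y][x]

scores = block_prnu_scores(image, cam_a)
suspicious = [pos for pos, score in scores.items() if score < 0.5]
print(scores)
print("suspicious blocks:", suspicious)
```

On real photographs the correlations are far weaker than in this toy setup, so larger blocks and a statistically motivated threshold would be needed.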
Compression and processing artifacts. All image processing techniques leave a trace in the image statistics. Digital images are very commonly compressed as JPEG, which quantizes the coefficients of a discrete cosine transform (DCT), and this quantization leaves its own traces. One interesting technique is to detect JPEG ghosts, that is, parts of an image which have been compressed twice via DCT. As you mention, I believe that downsampling will remove some of these artifacts, although the downsampling itself will be detectable.
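The requantization effect behind JPEG ghosts can be illustrated without a real codec. The sketch below is a simplification under stated assumptions: it works on a flat list of made-up coefficient values rather than 8x8 DCT blocks, uses a single quantization step per "compression", and picks the steps 7 and 2 arbitrarily. The idea it shows is that values which were once quantized with a coarser step land close to multiples of that step, so requantizing with the same step changes them very little.

```python
import math
import random

def quantize(values, step):
    """Round each value to the nearest multiple of step (halves round up)."""
    return [step * math.floor(v / step + 0.5) for v in values]

def requantization_error(values, step):
    """Mean squared change caused by quantizing the values again with step."""
    requantized = quantize(values, step)
    return sum((v - r) ** 2 for v, r in zip(values, requantized)) / len(values)

rng = random.Random(1)
host_coeffs = [rng.uniform(-100, 100) for _ in range(2000)]
donor_coeffs = [rng.uniform(-100, 100) for _ in range(2000)]

Q_DONOR, Q_FINAL = 7, 2  # assumed steps: coarse for the donor, fine for the final save

# The host region is compressed once; the spliced region was compressed
# with the coarser step before being pasted in and saved with the final step.
host_region = quantize(host_coeffs, Q_FINAL)
spliced_region = quantize(quantize(donor_coeffs, Q_DONOR), Q_FINAL)

# Sweep trial steps and compare the two regions at each one.
ratios = {}
for trial in range(3, 12):
    host_err = requantization_error(host_region, trial)
    spliced_err = requantization_error(spliced_region, trial)
    ratios[trial] = spliced_err / host_err

ghost_step = min(ratios, key=ratios.get)
print("trial step with the strongest ghost:", ghost_step)
```

In an actual image the same sweep would be done by resaving at a range of JPEG qualities and looking at the per-region difference to the original.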
Scene consistency. An image acquired from a single source should have consistent perspective (vanishing points) and illumination. Moreover, it's hard to fake these in a composite image. I recommend looking through (Redi et al., 2011) for more details here.
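As a small illustration of the vanishing point check, the sketch below assumes you have already extracted straight edges that are parallel in the real scene (for example the sides of a road). Their image lines should meet in one vanishing point, and an edge from a pasted object that is supposed to share that direction but misses the point by a large margin is suspicious. The edge coordinates and function names are invented for the example.

```python
import math

def line_through(p, q):
    """Homogeneous line (a, b, c) with a*x + b*y + c = 0 through points p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)

def intersection(l1, l2):
    """Intersection of two lines, or None if they are parallel in the image."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    w = a1 * b2 - a2 * b1
    if abs(w) < 1e-12:
        return None
    return ((b1 * c2 - b2 * c1) / w, (c1 * a2 - c2 * a1) / w)

def distance_to_line(point, line):
    """Perpendicular distance in pixels from a point to a line."""
    a, b, c = line
    return abs(a * point[0] + b * point[1] + c) / math.hypot(a, b)

# Two edges assumed to be parallel in the scene define the vanishing point.
left_edge = line_through((0, 300), (200, 200))
right_edge = line_through((800, 300), (600, 200))
vanishing_point = intersection(left_edge, right_edge)

# Further edges that should share the same scene direction.
consistent_edge = line_through((400, 500), (400, 300))
pasted_edge = line_through((100, 500), (300, 450))

consistent_miss = distance_to_line(vanishing_point, consistent_edge)
pasted_miss = distance_to_line(vanishing_point, pasted_edge)
print(vanishing_point, consistent_miss, pasted_miss)
```

With real edge detections the lines never meet exactly, so the miss distance has to be judged against the measurement noise of the edge detector.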
Finally, if you say "Okay, I give up. There are too many possible methods, I just want a detector", you can look at this recent ICCV paper, where they train a detector to find where an image has been manipulated. This may give you some more insight into training a black-box model.
Bappy, Jawadul H., et al. "Exploiting Spatial Structure for Localizing Manipulated Image Regions." Proceedings of the IEEE International Conference on Computer Vision. 2017.
Datasets/Contests:
CASIA V1.0 and V2.0 (image splicing)
http://forensics.idealtest.org/
COVERAGE (copy-move manipulations)
https://github.com/wenbihan/coverage
Media Forensics Challenge 2018 (various manipulations, requires registration)
https://www.nist.gov/itl/iad/mig/media-forensics-challenge-2018
IEEE IFS-TC Image Forensics Challenge Dataset. (website currently unavailable)
RAISE (raw, unprocessed images along with camera metadata)
http://mmlab.science.unitn.it/RAISE/index.php
Surveys:
Redi, Judith A., Wiem Taktak, and Jean-Luc Dugelay. "Digital image forensics: a booklet for beginners." Multimedia Tools and Applications 51.1 (2011): 133-162.
https://pdfs.semanticscholar.org/8e85/c7ad6cd0986225e63dc1b4264b3e084b3f9b.pdf
Fridrich, Jessica. "Digital image forensics." IEEE Signal Processing Magazine 26.2 (2009).
http://ws.binghamton.edu/fridrich/Research/full_paper_02.pdf
Farid, Hany. Digital Image Forensics: lecture notes, exercises, and matlab code for a survey course in digital image and video forensics.
http://www.cs.dartmouth.edu/farid/downloads/tutorials/digitalimageforensics.pdf
Kirchner, Matthias. Notes on digital image forensics and counter-forensics. Diss. Dartmouth College, 2012.
http://ws.binghamton.edu/kirchner/papers/image_forensics_and_counter_forensics.pdf
Memon, Nasir. "Photo Forensics–There Is More to a Picture than Meets the Eye." International Workshop on Digital Watermarking. Springer, Berlin, Heidelberg, 2011.
Mahdian, Babak, and Stanislav Saic. "A bibliography on blind methods for identifying image forgery." Signal Processing: Image Communication 25.6 (2010): 389-399.
Image Tampering Detection and Localization (includes recent deep learning references)
https://github.com/yannadani/image_tampering_detection_references