Given the nature of your data, I would suggest you investigate exponential smoothing as well as ARIMA-type models, especially given the temporal constraints within your data. Although I don't doubt that spatial dependencies exist, I would be a bit skeptical about their usefulness in forecasting (over what I imagine are fairly large areas), especially since any spatial dependency will likely already be captured (at least to a certain extent) in previous observations in the series.
Where the spatial dependencies may be helpful is if you have small area estimation problems, where you can use the spatial dependency in your data to help smooth out your estimates in those noisy geographic regions. This may not be a problem, though, since you have aggregated data for a full year.
You shouldn't take my word for it, though; you should investigate the economics literature on the subject and assess various forecasting methods yourself. It's quite possible that other variables are useful predictors of future unemployment in similar panel settings.
Edit:
First, I'd like to clarify that I did not mean the OP should simply prefer some type of exponential smoothing over other techniques. I think the OP should assess the performance of various forecasting methods using a hold-out sample of 1 or 2 time periods. I do not know the literature on forecasting unemployment, but I have not seen any method so obviously superior that others should be dismissed outright in any context.
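A minimal sketch of the kind of hold-out comparison I have in mind, assuming the data sit in an array with one row per region and one column per year. The panel here is simulated, and the function names (`ses_forecast`, `holdout_mae`) and the smoothing weight `alpha=0.5` are placeholders of mine, not anything from the question:

```python
import numpy as np

def ses_forecast(series, alpha):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

def holdout_mae(panel, forecaster, n_holdout=2):
    """Mean absolute error of one-step-ahead forecasts over the last
    n_holdout periods. panel has shape (n_regions, n_periods)."""
    errors = []
    n_periods = panel.shape[1]
    for t in range(n_periods - n_holdout, n_periods):
        for series in panel:
            # Only observations before period t are used to forecast period t.
            errors.append(abs(series[t] - forecaster(series[:t])))
    return float(np.mean(errors))

# Hypothetical panel: 20 regions, 12 annual observations (simulated, not real data).
rng = np.random.default_rng(0)
panel = 6.0 + np.cumsum(rng.normal(0, 0.5, size=(20, 12)), axis=1)

mae_naive = holdout_mae(panel, lambda s: s[-1])       # last value carried forward
mae_mean = holdout_mae(panel, lambda s: s.mean())     # historical mean
mae_ses = holdout_mae(panel, lambda s: ses_forecast(s, alpha=0.5))

print(f"naive: {mae_naive:.3f}  mean: {mae_mean:.3f}  SES: {mae_ses:.3f}")
```

Any candidate method can be dropped in as another `forecaster` callable, so simple and complicated procedures are scored on exactly the same held-out periods.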
Kwak mentions a key point I did not consider initially (and Stephan's comment makes the same point very succinctly as well). The panel nature of the data allows one to estimate an auto-regressive component in the model much more easily than in a single time series. So I would follow his suggestion and consider the A/B estimator a good bet to provide the best forecast accuracy.
I'm still sticking with my initial suggestion, though: I am skeptical of the usefulness of the spatial dependence, and one should assess a model's predictive accuracy with and without the spatial component. In terms of prediction, the question is not simply whether some sort of spatial auto-correlation exists; it is whether that spatial auto-correlation is useful in predicting future values independent of past observations in the series.
To simplify my reasoning, let's define:
$R_{t}$ corresponds to a geographic region $R$ at time $t$
$R_{t-1}$ corresponds to a geographic region $R$ at the previous time period
$W_{t-1}$ corresponds to however one wants to define the spatial relationship for the neighbors of $R_{t}$ at the previous time period
In this case $R$ is some attribute and $W$ is that same attribute in the neighbors of $R$ (i.e. an endogenous spatial lag.)
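As a concrete illustration, here is one common way to build such a spatial lag: average the attribute over each region's neighbors using a row-standardized contiguity matrix. This is only one of many possible definitions of $W$; the four-region layout and the values are made up for the example.

```python
import numpy as np

# Hypothetical binary contiguity for 4 regions arranged in a line: 0 - 1 - 2 - 3
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

def row_standardize(adj):
    """Scale each row to sum to 1; regions without neighbors keep a zero row."""
    row_sums = adj.sum(axis=1, keepdims=True)
    return np.divide(adj, row_sums, out=np.zeros_like(adj), where=row_sums > 0)

weights = row_standardize(adjacency)

r_prev = np.array([4.0, 6.0, 8.0, 10.0])  # R_{t-1}: the attribute in each region
w_prev = weights @ r_prev                 # W_{t-1}: mean of the attribute in the neighbors

print(w_prev)  # [6. 6. 8. 8.]
```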
In pretty much all cases of lattice areal data, we have a relationship between $R$ and $W$. Two general explanations for this relationship are
1) The General Social Process Theory
This is when there are processes that affect $R$ and $W$ simultaneously and result in similar values with some sort of spatial organization. The forces that shape the attribute operate over a broader scope than the areal units encompass, and the support of the data cannot distinguish between them. (I imagine there is a better name for this, so I would welcome one if someone can help me out.)
2) The Spatial Externalities Theory
This is when some attribute of $W$ directly affects an attribute of $R$. Srikant's example of job diffusion is an example of this.
In the context of forecasting, the general social process model may not be all that helpful. In this case, $R_{t-1}$ and $W_{t-1}$ are reflective of the same external shocks, and so $W_{t-1}$ is less likely to have exogenous power to predict $R_{t}$ independent of $R_{t-1}$.
IMO, in the spatial externalities case I would expect $W_{t-1}$ to have a greater potential to forecast $R_{t}$ independent of $R_{t-1}$ in the short run, because $R_{t-1}$ and $W_{t-1}$ can be reflective of different external shocks to the system. This is my opinion, though, and you typically can't distinguish between the general social process model and the spatial externalities model through empirical means in a cross-sectional design (they are probably both occurring to a certain extent in many contexts). Hence I would attempt to validate the usefulness of the spatial component before simply incorporating it into the forecast. Better knowledge of the literature and of the social processes involved would definitely be helpful here to guide your model building. In criminology, the externalities model makes sense only in a very limited set of circumstances (but I imagine it is more likely in economics data). Models of spatial hedonic housing prices often show very strong spatial effects, and in that context I would expect the spatial component to have a strong ability to forecast housing prices. (I like Luc Anselin's explanation of these two different processes better than mine in this paper, PDF here)
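One way to run that validation, sketched under loud assumptions: the panel below is simulated from a process I made up (a ring of 30 regions where the neighbors' lag truly enters with coefficient `rho`), and the forecasting model is a plain pooled least-squares regression rather than the A/B estimator. The point is only the mechanics of comparing hold-out error with and without $W_{t-1}$.

```python
import numpy as np

def simulate_panel(weights, n_periods, phi, rho, noise_sd, rng):
    """Hypothetical process: R_t = 1 + phi * R_{t-1} + rho * W_{t-1} + noise."""
    n_regions = weights.shape[0]
    r = np.empty((n_regions, n_periods))
    r[:, 0] = rng.normal(6.0, 1.0, n_regions)
    for t in range(1, n_periods):
        shock = rng.normal(0.0, noise_sd, n_regions)
        r[:, t] = 1.0 + phi * r[:, t - 1] + rho * (weights @ r[:, t - 1]) + shock
    return r

def holdout_rmse(panel, weights, use_spatial_lag, n_holdout=2):
    """Fit pooled OLS on the early periods, score on the last n_holdout periods."""
    own_lag = panel[:, :-1]               # R_{t-1}
    nbr_lag = weights @ panel[:, :-1]     # W_{t-1}
    target = panel[:, 1:]                 # R_t
    cols = [np.ones_like(own_lag), own_lag]
    if use_spatial_lag:
        cols.append(nbr_lag)
    split = target.shape[1] - n_holdout
    x_train = np.column_stack([c[:, :split].ravel() for c in cols])
    x_test = np.column_stack([c[:, split:].ravel() for c in cols])
    coef, *_ = np.linalg.lstsq(x_train, target[:, :split].ravel(), rcond=None)
    resid = target[:, split:].ravel() - x_test @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

# Ring of 30 regions: each region's neighbors are the two adjacent regions.
n = 30
idx = np.arange(n)
weights = np.zeros((n, n))
weights[idx, (idx - 1) % n] = 0.5
weights[idx, (idx + 1) % n] = 0.5

rng = np.random.default_rng(1)
panel = simulate_panel(weights, n_periods=15, phi=0.5, rho=0.3, noise_sd=0.3, rng=rng)

rmse_without = holdout_rmse(panel, weights, use_spatial_lag=False)
rmse_with = holdout_rmse(panel, weights, use_spatial_lag=True)
print(f"hold-out RMSE without W: {rmse_without:.3f}, with W: {rmse_with:.3f}")
```

On real data, if the two hold-out errors are about the same, $W_{t-1}$ is not adding forecasting power beyond $R_{t-1}$, whatever the in-sample spatial auto-correlation looks like.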
How we define $W$ is often a further problem in this setting. Most conceptions of $W$ are very simplistic and probably aren't entirely reflective of real geographic processes. Here kwak's suggestion of adding a random component to the $W$ effect for each $R$ makes a lot of sense. For example, we would expect New York City to influence its neighbors, but we wouldn't expect NYC's neighbors to have all that much influence on NYC. This still doesn't solve how to decide what counts as a neighbor or how best to represent the effects of neighbors. What kwak suggests is essentially a local version of Geary's C (spatial differences); local Moran's I (spatial averages) is a common approach as well.
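To make the "spatial differences" versus "spatial averages" distinction concrete, here is a small sketch of both local statistics under one common scaling (dividing by the second moment $m_2 = \sum_i z_i^2 / n$ of the centered values). Software packages scale these slightly differently, and the four-region layout and values are hypothetical.

```python
import numpy as np

def local_moran(x, weights):
    """Local Moran's I: each centered value times the weighted average of
    its neighbors' centered values, scaled by the second moment."""
    z = x - x.mean()
    m2 = np.mean(z ** 2)
    return z * (weights @ z) / m2

def local_geary(x, weights):
    """Local Geary's C: weighted squared differences between each region
    and its neighbors, scaled by the second moment."""
    z = x - x.mean()
    m2 = np.mean(z ** 2)
    sq_diff = (z[:, None] - z[None, :]) ** 2
    return (weights * sq_diff).sum(axis=1) / m2

# Hypothetical row-standardized weights for 4 regions in a line: 0 - 1 - 2 - 3
weights = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
x = np.array([4.0, 6.0, 8.0, 10.0])

moran_i = local_moran(x, weights)
geary_c = local_geary(x, weights)
print(moran_i, geary_c)
```

With row-standardized weights, the mean of the local Moran values equals the global Moran's I, which is a handy sanity check on the scaling.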
I'm still a little surprised at the negative responses to my suggestion to use simpler smoothing methods (even if they are meant for univariate time series). Am I naive to think exponential smoothing or some other type of moving-window technique would perform comparably enough to more complicated procedures to be worth assessing? I would be more worried if the series were such that we would expect seasonal components, but that is not the case here.